BILEVEL POLYNOMIAL PROGRAMS AND SEMIDEFINITE
RELAXATION METHODS 

JIAWANG NIE, LI WANG, AND JANE J. YE 


Abstract. A bilevel program is an optimization problem whose constraints
involve the solution set to another optimization problem parameterized by
upper level variables. This paper studies bilevel polynomial programs (BPPs),
i.e., all the functions are polynomials. We reformulate BPPs equivalently as
semi-infinite polynomial programs (SIPPs), using Fritz John conditions and
Jacobian representations. Combining the exchange technique and Lasserre type
semidefinite relaxations, we propose a numerical method for solving bilevel
polynomial programs. For simple BPPs, we prove the convergence to global
optimal solutions. Numerical experiments are presented to show the efficiency
of the proposed algorithm.


1. Introduction 


We consider the bilevel polynomial program (BPP): 

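\[
(P): \quad \left\{ \begin{array}{rl}
F^* := \min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & G_i(x,y) \ge 0, \ i = 1, \ldots, m_1, \\
& y \in S(x),
\end{array} \right. \tag{1.1}
\]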

where $F$ and all $G_i$ are real polynomials in $(x,y)$, and $S(x)$ is the set of
global minimizers of the following lower level program, which is parameterized
by $x$:

\[
\min_{z \in \mathbb{R}^p} \ f(x,z) \quad \text{s.t.} \quad g_j(x,z) \ge 0, \ j = 1, \ldots, m_2. \tag{1.2}
\]

In (1.2), $f$ and each $g_j$ are polynomials in $(x,z)$. For convenience, denote
\[
Z(x) := \{ z \in \mathbb{R}^p \mid g_j(x,z) \ge 0, \ j = 1, \ldots, m_2 \},
\]
the feasible set of (1.2). The inequalities $G_i(x,y) \ge 0$ are called upper (or
outer) level constraints, while $g_j(x,z) \ge 0$ are called lower (or inner) level
constraints. When $m_1 = 0$ (resp., $m_2 = 0$), there are no upper (resp., lower)
level constraints. Similarly, $F(x,y)$ is the upper level (or outer) objective,
and $f(x,z)$ is the lower level (or inner) objective. Denote the set


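\[
\mathcal{U} := \left\{ (x,y) \ \middle| \ \begin{array}{l}
G_i(x,y) \ge 0 \ (i = 1, \ldots, m_1), \\
g_j(x,y) \ge 0 \ (j = 1, \ldots, m_2)
\end{array} \right\}. \tag{1.3}
\]

Then the feasible set of (P) is the intersection

\[
\mathcal{U} \cap \{ (x,y) : y \in S(x) \}. \tag{1.4}
\]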
Throughout the paper, we assume that for all $(x,y) \in \mathcal{U}$, $S(x) \ne \emptyset$, and
consequently the feasible set of (P) is nonempty. When the lower level feasible
set $Z(x) = Z$ is independent of $x$, we call the problem (P) a simple bilevel
polynomial program (SBPP). The SBPP is not mathematically simple but actually
quite challenging. SBPPs have important applications in economics, e.g., the
moral hazard model of the principal-agent problem [23]. When the feasible set
of the lower level program $Z(x)$ depends on $x$, the problem (P) is called a
general bilevel polynomial program (GBPP). GBPP is also an effective modelling
tool for many applications in various fields; see, e.g., […] and the references
therein.


1.1. Background. The bilevel program is a class of difficult optimization
problems. Even for the case where all the functions are linear, the problem is
NP-hard [3]. A general approach for solving bilevel programs is to transform
them into single level optimization problems. A commonly used technique is to
replace the lower level program by its Karush-Kuhn-Tucker (KKT) conditions.
When the lower level program involves inequality constraints, the reduced
problem becomes a so-called mathematical program with equilibrium constraints
(MPEC) [22, 32]. If the lower level program is nonconvex, the optimal solution
of a bilevel program may not even be a stationary point of the single level
optimization problem obtained by using the KKT conditions. This was shown by a
counterexample due to Mirrlees [23]. Moreover, even if the lower level program
is convex, it was shown in [10] that a local solution to the MPEC obtained by
replacing the lower level program by its KKT conditions may not be a local
solution to the original bilevel program. Recently, [1] proposed to replace the
lower level program with its Fritz John conditions instead of its KKT
conditions. However, it was shown in […] that the same difficulties remain,
i.e., solutions to the MPEC obtained by replacing the lower level program by
its Fritz John conditions may not be solutions to the original bilevel program.

An alternative approach for solving BPPs is to use the value function […],
which gives an equivalent reformulation. However, the optimal solution of the
bilevel program may not be a stationary point of the value function
reformulation. To overcome this difficulty, [43] proposed to combine the KKT
and the value function reformulations. Over the past two decades, many
numerical algorithms were proposed for solving bilevel programs. However, most
of them assume that the lower level program is convex, with few exceptions […].
In […], an algorithm using branch and bound in combination with the exchange
technique was proposed to find approximate global optimal solutions. Recently,
smoothing techniques were used to find stationary points of the value function
or the combined reformulation of simple bilevel programs […].

In general, it is quite difficult to find global minimizers of nonconvex
optimization problems. However, when the functions are polynomials, there
exists much work on computing global optimizers, by using Lasserre type
semidefinite relaxations […]. We refer to […] for the recent work in this area.
Recently, Jeyakumar, Lasserre, Li and Pham […] worked on simple bilevel
polynomial programs. When the lower level program (1.2) is convex for each
fixed $x$, they transformed (1.1) into a single level polynomial program, by
using Fritz John conditions and the multipliers to replace the lower level
program, and solved it globally by using Lasserre type relaxations. When (1.2)
is nonconvex for some $x$, by approximating the value function of lower level
programs by a sequence of polynomials, they proposed to reformulate (1.1) with
approximate lower level programs by the value function approach, and to solve
the resulting sequence of polynomial programs globally by using Lasserre type
relaxations. The work […] is very inspiring, because polynomial optimization
techniques were proposed to solve BPPs. In this paper, we also use Lasserre
type semidefinite relaxations to solve BPPs, but we make different
reformulations, by using Jacobian representations and the exchange technique in
semi-infinite programming.


1.2. From BPP to SIPP. A bilevel program can be reformulated as a
semi-infinite program (SIP). Thus, the classical methods (e.g., the exchange
method […]) for SIPs can be applied to solve bilevel programs. For convenience
of introduction, at the moment, we consider SBPPs, i.e., the feasible set
$Z(x) = Z$ in (1.2) is independent of $x$.

Before reformulating BPPs as SIPs, we show the fact:

\[
y \in S(x) \iff y \in Z, \quad H(x,y,z) \ge 0 \ (\forall\, z \in Z), \tag{1.5}
\]

where $H(x,y,z) := f(x,z) - f(x,y)$. Clearly, the "$\Rightarrow$" direction is
true. Let us prove the reverse direction. Let $v(x)$ denote the value function:

\[
v(x) := \inf_{z \in Z} f(x,z). \tag{1.6}
\]

If $(x,y)$ satisfies the right hand side conditions in (1.5), then
\[
\inf_{z \in Z} H(x,y,z) = v(x) - f(x,y) \ge 0.
\]
Since $y \in Z$, we have $v(x) - f(x,y) \le 0$. Combining these two inequalities,
we get
\[
v(x) = \inf_{z \in Z} f(x,z) = f(x,y),
\]
and hence $y \in S(x)$.

By the fact (1.5), the problem (P) is equivalent to

\[
\left\{ \begin{array}{rl}
F^* := \min\limits_{x \in \mathbb{R}^n,\, y \in Z} & F(x,y) \\
\text{s.t.} & G_i(x,y) \ge 0, \ i = 1, \ldots, m_1, \\
& H(x,y,z) \ge 0, \ \forall\, z \in Z.
\end{array} \right. \tag{1.7}
\]

The problem (1.7) is a semi-infinite polynomial program (SIPP) if the set $Z$ is
infinite. Hence, the exchange method can be used to solve it. Suppose $Z_k$ is a
finite grid of $Z$. Replacing $Z$ by $Z_k$ in (1.7), we get:

\[
(P_k): \quad \left\{ \begin{array}{rl}
F_k^* := \min\limits_{x \in \mathbb{R}^n,\, y \in Z} & F(x,y) \\
\text{s.t.} & G_i(x,y) \ge 0, \ i = 1, \ldots, m_1, \\
& H(x,y,z) \ge 0, \ \forall\, z \in Z_k.
\end{array} \right. \tag{1.8}
\]

The feasible set of $(P_k)$ contains that of (P). Hence,
\[
F_k^* \le F^*.
\]
Since $Z_k$ is a finite set, $(P_k)$ is a polynomial optimization problem. If,
for some $Z_k$, we can get an optimizer $(x^k, y^k)$ of $(P_k)$ such that

\[
v(x^k) - f(x^k, y^k) \ge 0, \tag{1.9}
\]
then $y^k \in S(x^k)$ and $(x^k, y^k)$ is feasible for (P). In such a case,
$(x^k, y^k)$ must be a global optimizer of (P). Otherwise, if (1.9) fails to
hold, then there exists $z^k \in Z$ such that
\[
f(x^k, z^k) - f(x^k, y^k) < 0.
\]
For such a case, we can construct the new grid set as
\[
Z_{k+1} := Z_k \cup \{ z^k \},
\]
and then solve the new problem $(P_{k+1})$ with the grid set $Z_{k+1}$. Repeating
this process, we can get an algorithm for solving (P) approximately.
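
The exchange loop just described can be sketched on a small instance. The following is a minimal illustration, not the implementation used in the paper: every subproblem is solved by exhaustive search over a uniform grid instead of by Lasserre relaxations, and the toy functions `F` and `f`, the set Z = [-1, 1], the grid size `n`, and the tolerance `eps` are all assumptions chosen for illustration.

```python
# Exchange method for a toy simple bilevel program. Every subproblem is
# solved by exhaustive search on a grid (an assumption made for illustration;
# the paper solves the subproblems by Lasserre relaxations instead).

def F(x, y):
    """Upper level objective of the toy instance."""
    return x * y - y + 0.5 * y * y

def f(x, z):
    """Lower level objective of the toy instance; Z = [-1, 1]."""
    return -x * z * z + 0.5 * z ** 4

def H(x, y, z):
    """H(x, y, z) = f(x, z) - f(x, y), as in (1.5)."""
    return f(x, z) - f(x, y)

def grid(n):
    """n equally spaced points on [-1, 1]."""
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]

def exchange(n=21, eps=1e-9, max_iter=100):
    pts = grid(n)          # shared grid for x, y and z
    Z_k = []               # current finite subset of Z
    lower_bounds = []      # optimal values F_k^* of the relaxations (P_k)
    for _ in range(max_iter):
        # Solve (P_k): minimize F subject to H(x, y, z) >= 0 for z in Z_k.
        F_k, x_k, y_k = min(
            (F(x, y), x, y)
            for x in pts for y in pts
            if all(H(x, y, z) >= -eps for z in Z_k)
        )
        lower_bounds.append(F_k)
        # Test (1.9): find the most violated z in Z for the point (x_k, y_k).
        z_k = min(pts, key=lambda z: H(x_k, y_k, z))
        if H(x_k, y_k, z_k) >= -eps:
            return x_k, y_k, F_k, lower_bounds   # y_k solves the lower level
        Z_k.append(z_k)    # Z_{k+1} := Z_k U {z_k}
    raise RuntimeError("no feasible point found within max_iter iterations")

x_opt, y_opt, F_opt, bounds = exchange()
```

On a finite grid the loop terminates, because each added point `z_k` violates a constraint that all points already in `Z_k` satisfy, so `Z_k` grows strictly. The list `bounds` records the nondecreasing lower bounds $F_k^*$.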

How does the above approach work in computational practice? Does it converge
to global optimizers? Each subproblem $(P_k)$ is a polynomial optimization
problem, which is generally nonconvex. Theoretically, it is NP-hard to solve
polynomial optimization globally. However, in practice, it can be solved
successfully by Lasserre type semidefinite relaxations (cf. […]). Recently, it
was shown in [30] that Lasserre type semidefinite relaxations are generally
tight for solving polynomial optimization problems. About the convergence, we
can see that $\{F_k^*\}$ is a sequence of monotonically increasing lower bounds
for the global optimal value $F^*$, i.e.,
\[
F_1^* \le \cdots \le F_k^* \le F_{k+1}^* \le \cdots \le F^*.
\]
By a standard analysis for SIP (cf. […]), one can expect the convergence
$F_k^* \to F^*$, under some conditions. However, we would like to point out that
the above exchange process typically converges very slowly for solving BPPs. A
major reason is that the feasible set of $(P_k)$ is much larger than that of
(P). Indeed, the dimension of the feasible set of $(P_k)$ is typically larger
than that of (P). This is because, for every feasible $(x,y)$ in (P), $y$ must
also satisfy optimality conditions for the lower level program (1.2).
Meanwhile, the $y$ in $(P_k)$ need not satisfy such optimality conditions.
Typically, for $(P_k)$ to approximate (P) reasonably well, the grid set $Z_k$
should be very big. In practice, the above standard exchange method is not
efficient for solving BPPs.


1.3. Contributions. In this paper, we propose an efficient computational method 
for solving BPPs. First, we transform a BPP into an equivalent SIPP, by using Fritz 
John conditions and Jacobian representations. Then, we propose a new algorithm 
for solving BPPs, by using the exchange technique and Lasserre type semidefinite 
relaxations. 

For each $(x,y)$ that is feasible for (1.1), $y$ is a minimizer for the lower
level program parameterized by $x$. If some constraint qualification conditions
are satisfied, the KKT conditions hold. If such qualification conditions fail
to hold, the KKT conditions might not be satisfied. However, the Fritz John
conditions always hold for (1.2) (cf. [6, §3.3.5] and [5] for optimality
conditions for convex programs without constraint qualifications). So, we can
add the Fritz John conditions to (P), while the problem is not changed. A
disadvantage of using Fritz John conditions is the usage of multipliers, which
need to be considered as new variables. Typically, using multipliers will make
the polynomial program much harder to solve, because of new additional
variables. To overcome this difficulty, the technique in [28, §2] can be
applied to avoid the usage of multipliers. This technique is known as Jacobian
representations for optimality conditions.
The above observations motivate us to solve bilevel polynomial programs, by
combining Fritz John conditions, Jacobian representations, Lasserre
relaxations, and the exchange technique. Our major results are as follows:

• Unlike some prior methods for solving BPPs, we do not assume the KKT
conditions hold for the lower level program (1.2). Instead, we use the Fritz
John conditions. This is because the KKT conditions may fail to hold for
the lower level program (1.2), while the Fritz John conditions always hold.
By using Jacobian representations, the usage of multipliers can be avoided.
This greatly improves the computational efficiency.

• For simple bilevel polynomial programs, we propose an algorithm using
Jacobian representations, Lasserre relaxations and the exchange technique.
Its convergence to global minimizers is proved. The numerical experiments
show that it is efficient for solving SBPPs.

• For general bilevel polynomial programs, we can apply the same algorithm,
using Jacobian representations, Lasserre relaxations and the exchange
technique. The numerical experiments show that it works well for some GBPPs,
while it is not theoretically guaranteed to get global optimizers. However,
its convergence to global optimality can be proved under some assumptions.


The paper is organized as follows: In Section 2, we review some preliminaries
in polynomial optimization and Jacobian representations. In Section 3, we
propose a method for solving simple bilevel polynomial programs and prove its
convergence. In Section 4, we consider general bilevel polynomial programs and
show how the algorithm works. In Section 5, we present numerical experiments to
demonstrate the efficiency of the proposed methods. In Section 6, we make some
conclusions and discussions about our method.


2. Preliminaries 

Notation. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$, $\mathbb{C}$) denotes the set of
nonnegative integers (resp., real numbers, complex numbers). For an integer
$n > 0$, $[n]$ denotes the set $\{1, \ldots, n\}$. For $x := (x_1, \ldots, x_n)$ and
$\alpha := (\alpha_1, \ldots, \alpha_n)$, denote the monomial
\[
x^\alpha := x_1^{\alpha_1} \cdots x_n^{\alpha_n}.
\]
For a finite set $T$, $|T|$ denotes its cardinality. The symbol
$\mathbb{R}[x] := \mathbb{R}[x_1, \ldots, x_n]$ denotes the ring of polynomials in
$x := (x_1, \ldots, x_n)$ with real coefficients, whereas $\mathbb{R}[x]_k$ denotes its
subspace of polynomials of degree at most $k$. For a polynomial
$p \in \mathbb{R}[x]$, define the set product
\[
p \cdot \mathbb{R}[x] := \{ pq \mid q \in \mathbb{R}[x] \}.
\]
It is the principal ideal generated by $p$. For a symmetric matrix $W$,
$W \succeq 0$ (resp., $\succ 0$) means that $W$ is positive semidefinite (resp.,
definite). For a vector $u \in \mathbb{R}^n$, $\|u\|$ denotes the standard Euclidean
norm. The gradient of a function $f(x)$ is denoted as $\nabla f(x)$. If $f(x,z)$
is a function in both $x$ and $z$, then $\nabla_z f(x,z)$ denotes the gradient
with respect to $z$. For an optimization problem, argmin denotes the set of its
optimizers.

2.1. Polynomial optimization. An ideal $I$ in $\mathbb{R}[x]$ is a subset of
$\mathbb{R}[x]$ such that $I \cdot \mathbb{R}[x] \subseteq I$ and $I + I \subseteq I$. For a
tuple $p = (p_1, \ldots, p_r)$ in $\mathbb{R}[x]$, $I(p)$ denotes the smallest ideal
containing all $p_i$, i.e.,
\[
I(p) = p_1 \cdot \mathbb{R}[x] + \cdots + p_r \cdot \mathbb{R}[x].
\]
The $k$th truncation of the ideal $I(p)$, denoted as $I_k(p)$, is the set
\[
p_1 \cdot \mathbb{R}[x]_{k - \deg(p_1)} + \cdots + p_r \cdot \mathbb{R}[x]_{k - \deg(p_r)}.
\]
For the polynomial tuple $p$, denote its real zero set
\[
V(p) := \{ v \in \mathbb{R}^n \mid p(v) = 0 \}.
\]


A polynomial $\sigma \in \mathbb{R}[x]$ is said to be a sum of squares (SOS) if
$\sigma = a_1^2 + \cdots + a_k^2$ for some $a_1, \ldots, a_k \in \mathbb{R}[x]$. The set of
all SOS polynomials in $x$ is denoted as $\Sigma[x]$. For a degree $m$, denote the
truncation
\[
\Sigma[x]_m := \Sigma[x] \cap \mathbb{R}[x]_m.
\]
For a tuple $q = (q_1, \ldots, q_t)$, its quadratic module is the set
\[
Q(q) := \Sigma[x] + q_1 \cdot \Sigma[x] + \cdots + q_t \cdot \Sigma[x].
\]
The $k$-th truncation of $Q(q)$, denoted as $Q_k(q)$, is the set
\[
\Sigma[x]_{2k} + q_1 \cdot \Sigma[x]_{d_1} + \cdots + q_t \cdot \Sigma[x]_{d_t},
\]
where each $d_i = 2k - \deg(q_i)$. For the tuple $q$, denote the basic
semialgebraic set
\[
S(q) := \{ v \in \mathbb{R}^n \mid q(v) \ge 0 \}.
\]


For the polynomial tuples $p$ and $q$ as above, if $f \in I(p) + Q(q)$, then
clearly $f \ge 0$ on the set $V(p) \cap S(q)$. However, the reverse is not
necessarily true. The sum $I(p) + Q(q)$ is said to be archimedean if there
exists $b \in I(p) + Q(q)$ such that $S(b) = \{ v \in \mathbb{R}^n : b(v) \ge 0 \}$ is a
compact set in $\mathbb{R}^n$. Putinar […] proved that if a polynomial $f > 0$ on
$V(p) \cap S(q)$ and if $I(p) + Q(q)$ is archimedean, then $f \in I(p) + Q(q)$.
When $f$ is only nonnegative (but not strictly positive) on $V(p) \cap S(q)$, we
still have $f \in I(p) + Q(q)$, under some general conditions (cf. [30]).

Now, we review Lasserre type semidefinite relaxations in polynomial
optimization. More details can be found in […]. Consider the general polynomial
optimization problem:

\[
\left\{ \begin{array}{rl}
f_{\min} := \min\limits_{x \in \mathbb{R}^n} & f(x) \\
\text{s.t.} & p(x) = 0, \ q(x) \ge 0,
\end{array} \right. \tag{2.1}
\]

where $f \in \mathbb{R}[x]$ and $p, q$ are tuples of polynomials. The feasible set of
(2.1) is precisely the intersection $V(p) \cap S(q)$. Lasserre's hierarchy of
semidefinite relaxations for solving (2.1) is ($k = 1, 2, \ldots$):


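\[
\left\{ \begin{array}{rl}
f_k := \max & \gamma \\
\text{s.t.} & f - \gamma \in I_{2k}(p) + Q_k(q).
\end{array} \right. \tag{2.2}
\]

When the set $I(p) + Q(q)$ is archimedean, Lasserre proved the convergence
\[
f_k \to f_{\min}, \quad \text{as } k \to \infty.
\]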


If there exists $k < \infty$ such that $f_k = f_{\min}$, Lasserre's hierarchy is said
to have finite convergence. Under the archimedeanness and some standard
conditions in optimization known to be generic (i.e., linear independence
constraint qualification, strict complementarity and second order sufficiency
conditions), Lasserre's hierarchy has finite convergence. This was recently
shown in [30]. On the other hand, there exist special polynomial optimization
problems for which Lasserre's hierarchy fails to have finite convergence. But
such special problems belong to a set of measure zero in the space of input
polynomials, as shown in [30]. Moreover, we can also get global minimizers of
(2.1) by using the flat extension or flat truncation condition (cf. [25]). The
optimization problem (2.2) can be solved as a semidefinite program, so it can
be solved by semidefinite program packages (e.g., SeDuMi […], SDPT3 […]). A
convenient and efficient software for using Lasserre relaxations is
GloptiPoly 3 [15].
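
To make the relaxation (2.2) concrete, consider the unconstrained univariate polynomial f(x) = x^4 - 2x^2, a toy instance chosen here for illustration and not taken from the paper. Without constraints, (2.2) asks for the largest γ such that f - γ is SOS. The sketch below checks, by exact integer coefficient arithmetic, that γ = -1 is certified by the single square (x^2 - 1)^2 and that this bound is attained at x = 1. The helper names `poly_mul` and `poly_eval` are hypothetical.

```python
# Coefficient lists are in increasing degree: [c0, c1, c2, ...].

def poly_mul(a, b):
    """Product of two univariate polynomials given by coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_eval(c, x):
    """Evaluate a coefficient list at x by Horner's rule."""
    acc = 0
    for coeff in reversed(c):
        acc = acc * x + coeff
    return acc

f = [0, 0, -2, 0, 1]     # f(x) = x^4 - 2x^2
gamma = -1               # candidate lower bound
s = [-1, 0, 1]           # s(x) = x^2 - 1

f_minus_gamma = [f[0] - gamma] + f[1:]
sos = poly_mul(s, s)     # sigma(x) = s(x)^2, an SOS polynomial

certificate_holds = (f_minus_gamma == sos)    # f - gamma = sigma identically
bound_attained = (poly_eval(f, 1) == gamma)   # f(1) = gamma
```

Since f - γ is a square, γ is a lower bound for f, and since f(1) = γ the bound is tight, so for this instance the hierarchy is exact at the lowest order that accommodates degree 4. In practice γ and the SOS certificate are computed by a semidefinite programming solver rather than guessed.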


2.2. Jacobian representations. We consider the polynomial optimization
problem that is similar to the lower level program (1.2):

\[
\min_{z \in \mathbb{R}^p} \ f(z) \quad \text{s.t.} \quad g_1(z) \ge 0, \ldots, g_m(z) \ge 0, \tag{2.3}
\]

where $f, g_1, \ldots, g_m \in \mathbb{R}[z] := \mathbb{R}[z_1, \ldots, z_p]$. Let $Z$ be the feasible
set of (2.3). For $z \in Z$, let $J(z)$ denote the index set of active
constraining functions at $z$.

Suppose $z^*$ is an optimizer of (2.3). By the Fritz John condition (cf.
[6, §3.3.5]), there exists $(\mu_0, \mu_1, \ldots, \mu_m) \ne 0$ such that

\[
\mu_0 \nabla f(z^*) - \sum_{i=1}^{m} \mu_i \nabla g_i(z^*) = 0, \qquad
\mu_i g_i(z^*) = 0 \ (i \in [m]). \tag{2.4}
\]

A point like $z^*$ satisfying (2.4) is called a Fritz John point. If we only
consider active constraints, the above is then reduced to

\[
\mu_0 \nabla f(z^*) - \sum_{i \in J(z^*)} \mu_i \nabla g_i(z^*) = 0. \tag{2.5}
\]


The condition (2.4) uses multipliers $\mu_0, \ldots, \mu_m$, which are often not
known in advance. If we consider them as new variables, then it would increase
the number of variables significantly. For the index set
$J = \{ i_1, \ldots, i_k \}$, denote the matrix
\[
B[J, z] := \begin{bmatrix} \nabla f(z) & \nabla g_{i_1}(z) & \cdots & \nabla g_{i_k}(z) \end{bmatrix}.
\]
Then condition (2.5) means that the matrix $B[J(z^*), z^*]$ is rank deficient,
i.e.,
\[
\operatorname{rank} B[J(z^*), z^*] \le |J(z^*)|.
\]
The matrix $B[J(z^*), z^*]$ depends on the active set $J(z^*)$, which is
typically unknown in advance.

The technique in [28, §2] can be applied to get explicit equations for Fritz
John points, without using multipliers $\mu_i$. For a subset
$J = \{ i_1, \ldots, i_k \} \subseteq [m]$ with cardinality $|J| \le \min\{m, p-1\}$,
write its complement as $J^c := [m] \setminus J$. Then

  $B[J, z]$ is rank deficient $\iff$ all $(k+1) \times (k+1)$ minors of $B[J, z]$ are zeros.

There are totally $\binom{p}{k+1}$ equations defined by such minors. However,
this number can be significantly reduced by using the method in [28, §2]. The
number of equations, for characterizing that $B[J, z]$ is rank deficient, can
be reduced to
\[
\ell(J) := p(k+1) - (k+1)^2 + 1.
\]
It is much smaller than $\binom{p}{k+1}$. For cleanness of the paper, we do not
repeat the construction of this minimum number of defining polynomials.
Interested readers are referred to [28, §2] for the details. List all the
defining polynomials, which make $B[J, z]$ rank deficient, as

\[
\eta_1^J, \ldots, \eta_{\ell(J)}^J. \tag{2.6}
\]
Consider the products of these polynomials with the $g_j$'s:

\[
\eta_1^J \cdot \Big( \prod_{j \in J^c} g_j \Big), \ \ldots, \ \eta_{\ell(J)}^J \cdot \Big( \prod_{j \in J^c} g_j \Big). \tag{2.7}
\]

They are all polynomials in $z$. The active set $J(z)$ is undetermined, unless
$z$ is known. We consider all possible polynomials as in (2.7), for all
$J \subseteq [m]$, and collect them together. For convenience of notation, denote
all such polynomials as

\[
\psi_1, \ldots, \psi_L, \tag{2.8}
\]

where the number
\[
L = \sum_{J \subseteq [m],\, |J| \le \min\{m, p-1\}} \ell(J)
  = \sum_{0 \le k \le \min\{m, p-1\}} \binom{m}{k} \big( p(k+1) - (k+1)^2 + 1 \big).
\]
When $m$ and $p$ are big, the number $L$ would be very large. This is an
unfavorable feature of Jacobian representations.
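
The counts in this subsection are easy to tabulate. The sketch below evaluates the number of minors $\binom{p}{k+1}$, the reduced count $\ell(J)$, and the total $L$ for a few sizes; the function names and the sample sizes are hypothetical and chosen only for illustration.

```python
from math import comb

def num_minors(p, k):
    """Number of (k+1) x (k+1) minors of the p x (k+1) matrix B[J, z], |J| = k."""
    return comb(p, k + 1)

def ell(p, k):
    """Reduced number of defining polynomials: p(k+1) - (k+1)^2 + 1."""
    return p * (k + 1) - (k + 1) ** 2 + 1

def total_L(m, p):
    """Total count L, summed over index sets J with |J| = k <= min(m, p-1)."""
    return sum(comb(m, k) * ell(p, k) for k in range(min(m, p - 1) + 1))

# m lower level constraints, p lower level variables
for m, p in [(1, 1), (2, 2), (5, 10)]:
    print(m, p, total_L(m, p))
```

For instance, with p = 10 and k = 5 there are 210 minors but only 25 reduced equations, while the total `total_L(m, p)` still grows quickly with m and p because it sums over all admissible index sets.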

We point out that the Fritz John points can be characterized by using the
polynomials $\psi_1, \ldots, \psi_L$. Define the set of all Fritz John points:

\[
K_{FJ} := \left\{ z \in \mathbb{R}^p \ \middle| \ \begin{array}{l}
\exists (\mu_0, \mu_1, \ldots, \mu_m) \ne 0, \ \mu_i g_i(z) = 0 \ (i \in [m]), \\
\mu_0 \nabla f(z) - \sum_{i=1}^{m} \mu_i \nabla g_i(z) = 0
\end{array} \right\}. \tag{2.9}
\]

Let $W$ be the set of real zeros of the polynomials $\psi_j(z)$, i.e.,

\[
W = \{ z \in \mathbb{R}^p \mid \psi_1(z) = \cdots = \psi_L(z) = 0 \}. \tag{2.10}
\]

It is interesting to note that the sets $K_{FJ}$ and $W$ are equal.

Lemma 2.1. For $K_{FJ}$, $W$ as in (2.9)-(2.10), it holds that $K_{FJ} = W$.


Proof. First, we prove that $W \subseteq K_{FJ}$. Choose an arbitrary $u \in W$, and
let $J(u)$ be the active set at $u$. If $|J(u)| \ge p$, then the gradients
$\nabla f(u)$ and $\nabla g_j(u)$ ($j \in J(u)$) must be linearly dependent, so
$u \in K_{FJ}$. Next, we suppose $|J(u)| < p$. Note that $g_j(u) > 0$ for all
$j \in J(u)^c$. By the construction, some of $\psi_1, \ldots, \psi_L$ are the
polynomials as in (2.7):
\[
\eta_t^{J(u)} \cdot \Big( \prod_{j \in J(u)^c} g_j \Big).
\]
Thus, $\psi(u) = 0$ implies that all the polynomials $\eta_t^{J(u)}$ vanish at
$u$. By their definition, we know the matrix $B[J(u), u]$ does not have full
column rank. This means that $u \in K_{FJ}$.

Second, we show that $K_{FJ} \subseteq W$. Choose an arbitrary $u \in K_{FJ}$.

• Case I: $J(u) = \emptyset$. Then $\nabla f(u) = 0$. The first column of each
matrix $B[J, u]$ is zero, so all $\eta_t^J$ and $\psi_j$ vanish at $u$ and hence
$u \in W$.

• Case II: $J(u) \ne \emptyset$. Let $I \subseteq [m]$ be an arbitrary index set with
$|I| \le \min\{m, p-1\}$. If $J(u) \not\subseteq I$, then at least one $j \in I^c$
belongs to $J(u)$. Thus, at least one $j \in I^c$ satisfies $g_j(u) = 0$, so all
the polynomials
\[
\eta_t^{I} \cdot \Big( \prod_{j \in I^c} g_j \Big)
\]
vanish at $u$. If $J(u) \subseteq I$, then $\mu_i g_i(u) = 0$ implies that
$\mu_i = 0$ for all $i \in I^c$. By definition of $K_{FJ}$, the matrix $B[I, u]$
does not have full column rank. So, the minors $\eta_t^I$ of $B[I, u]$ vanish at
$u$. By the construction of $\psi_i$, we know all $\psi_i$ vanish at $u$, so
$u \in W$.

The proof is completed by combining the above two cases. □
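
Lemma 2.1 can be illustrated in the smallest case p = m = 1, where the only admissible index set is J = ∅ and the construction (2.7) yields the single polynomial ψ(z) = f'(z) g(z). The sketch below uses a hypothetical instance, f(z) = z^4/2 - c z^2 with a fixed parameter c and g(z) = 1 - z^2, and checks on sample points that ψ vanishes exactly where a nonzero multiplier pair solving the Fritz John system in (2.9) can be constructed. All function names are assumptions made for this illustration.

```python
def f_prime(z, c):
    """Derivative of f(z) = z^4/2 - c*z^2."""
    return 2.0 * z ** 3 - 2.0 * c * z

def g(z):
    return 1.0 - z * z

def g_prime(z):
    return -2.0 * z

def psi(z, c):
    """Jacobian polynomial for p = m = 1: psi = f' * g."""
    return f_prime(z, c) * g(z)

def fj_multipliers(z, c, tol=1e-12):
    """Return a nonzero (mu0, mu1) solving the system in (2.9) at z, or None."""
    if abs(g(z)) > tol:                 # inactive constraint forces mu1 = 0
        return (1.0, 0.0) if abs(f_prime(z, c)) <= tol else None
    if abs(g_prime(z)) > tol:           # active constraint, nonzero gradient
        return (1.0, f_prime(z, c) / g_prime(z))
    return (0.0, 1.0)                   # active constraint, zero gradient

c = 0.25
samples = [-1.0, -0.8, -0.5, 0.0, 0.3, 0.5, 1.0]
report = []
for z in samples:
    in_W = abs(psi(z, c)) <= 1e-12            # z is a zero of psi
    is_fj = fj_multipliers(z, c) is not None  # z is a Fritz John point
    report.append((z, in_W, is_fj))
```

In `report`, the two Boolean columns agree at every sample: the stationary points 0 and ±0.5 and the boundary points ±1 belong to both sets, while -0.8 and 0.3 belong to neither.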

3. Simple bilevel polynomial programs 

In this section, we study simple bilevel polynomial programs (SBPPs) and give
an algorithm for computing global optimizers. For SBPPs as in (1.1), the
feasible set $Z(x)$ for the lower level program (1.2) is independent of $x$.
Assume that $Z(x)$ is constantly the semialgebraic set

\[
Z := \{ z \in \mathbb{R}^p \mid g_1(z) \ge 0, \ldots, g_{m_2}(z) \ge 0 \}, \tag{3.1}
\]

for given polynomials $g_1, \ldots, g_{m_2}$ in $z := (z_1, \ldots, z_p)$. For each pair
$(x,y)$ that is feasible in (1.1), $y$ is an optimizer for (1.2), which now
becomes

\[
\min_{z \in \mathbb{R}^p} \ f(x,z) \quad \text{s.t.} \quad g_1(z) \ge 0, \ldots, g_{m_2}(z) \ge 0. \tag{3.2}
\]

Note that the inner objective $f$ still depends on $x$. So, $y$ must be a Fritz
John point of (3.2), i.e., there exists $(\mu_0, \mu_1, \ldots, \mu_{m_2}) \ne 0$
satisfying
\[
\mu_0 \nabla_z f(x,y) - \sum_{j=1}^{m_2} \mu_j \nabla_z g_j(y) = 0, \qquad
\mu_j g_j(y) = 0 \ (j \in [m_2]).
\]
Let $K_{FJ}(x)$ denote the set of all Fritz John points of (3.2). The set
$K_{FJ}(x)$ can be characterized by Jacobian representations. Let
$\psi_1, \ldots, \psi_L$ be the polynomials constructed as in (2.8). Note that each
$\psi_j$ is now a polynomial in $(x,z)$, because the objective of (3.2) depends
on $x$. Thus, each $(x,y)$ feasible for (1.1) satisfies
\[
\psi_1(x,y) = \cdots = \psi_L(x,y) = 0.
\]
For convenience of notation, denote the polynomial tuples

\[
\xi := (G_1, \ldots, G_{m_1}, g_1, \ldots, g_{m_2}), \qquad \psi := (\psi_1, \ldots, \psi_L). \tag{3.3}
\]

We call $\psi(x,y) = 0$ a Jacobian equation. Then, the SBPP as in (1.1) is
equivalent to the following SIPP:

\[
\left\{ \begin{array}{rl}
F^* := \min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & \psi(x,y) = 0, \ \xi(x,y) \ge 0, \\
& H(x,y,z) \ge 0, \ \forall\, z \in Z.
\end{array} \right. \tag{3.4}
\]

In the above, $H(x,y,z)$ is defined as in (1.5).


3.1. A semidefinite algorithm for SBPP. We have seen that the SBPP (1.1) is
equivalent to (3.4), which is an SIPP. So, we can apply the exchange method to
solve it. The basic idea of "exchange" is that we replace $Z$ by a finite grid
set $Z_k$ in (3.4), and then solve it for a global minimizer $(x^k, y^k)$ by
Lasserre relaxations. If $(x^k, y^k)$ is feasible for (1.1), we stop; otherwise,
we compute global minimizers of $H(x^k, y^k, z)$ and add them to $Z_k$. Repeat
this process until the convergence condition is met. We call $(x^*, y^*)$ a
global minimizer of (1.1), up to a tolerance parameter $\epsilon > 0$, if
$(x^*, y^*)$ is a global minimizer of the following approximate SIPP:

\[
\left\{ \begin{array}{rl}
F_\epsilon^* := \min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & \psi(x,y) = 0, \ \xi(x,y) \ge 0, \\
& H(x,y,z) \ge -\epsilon, \ \forall\, z \in Z.
\end{array} \right. \tag{3.5}
\]



















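Summarizing the above, we get the following algorithm.

Algorithm 3.1. (A Semidefinite Relaxation Algorithm for SBPP.)

Input: Polynomials $F$, $f$, $G_1, \ldots, G_{m_1}$, $g_1, \ldots, g_{m_2}$ for the SBPP as
in (1.1), a tolerance parameter $\epsilon > 0$, and a maximum number $k_{\max}$ of
iterations.

Output: The set $X^*$ of global minimizers of (1.1), up to the tolerance
$\epsilon$.

Step 1 Let $Z_0 = \emptyset$, $X^* = \emptyset$ and $k = 0$.

Step 2 Apply Lasserre relaxations to solve

\[
(P_k): \quad \left\{ \begin{array}{rl}
F_k^* := \min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & \psi(x,y) = 0, \ \xi(x,y) \ge 0, \\
& H(x,y,z) \ge 0 \ (\forall\, z \in Z_k),
\end{array} \right. \tag{3.6}
\]

and get the set $S_k = \{ (x_1^k, y_1^k), \ldots, (x_{r_k}^k, y_{r_k}^k) \}$ of its global
minimizers.

Step 3 For each $i = 1, \ldots, r_k$, do the following:

(a) Apply Lasserre relaxations to solve

\[
(Q_i^k): \quad \left\{ \begin{array}{rl}
v_i^k := \min\limits_{z \in \mathbb{R}^p} & H(x_i^k, y_i^k, z) \\
\text{s.t.} & \psi(x_i^k, z) = 0, \\
& g_1(z) \ge 0, \ldots, g_{m_2}(z) \ge 0,
\end{array} \right. \tag{3.7}
\]

and get the set $T_i^k = \{ z_{i,j}^k : j = 1, \ldots, t_i^k \}$ of its global minimizers.

(b) If $v_i^k \ge -\epsilon$, then update $X^* := X^* \cup \{ (x_i^k, y_i^k) \}$.

Step 4 If $X^* \ne \emptyset$ or $k > k_{\max}$, stop; otherwise, update $Z_k$ to
$Z_{k+1}$ as

\[
Z_{k+1} := Z_k \cup T_1^k \cup \cdots \cup T_{r_k}^k. \tag{3.8}
\]

Let $k := k + 1$ and go to Step 2.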
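
The steps of Algorithm 3.1 can be sketched on the instance (3.10) discussed in Section 3.2, where F(x,y) = xy - y + y^2/2, f(x,z) = -x z^2 + z^4/2 and Z = [-1, 1]. This is only an illustration under simplifying assumptions, not the authors' implementation: instead of Lasserre relaxations, both subproblems are solved by enumeration, and only one global minimizer of (3.6) is tracked per iteration rather than the full set $S_k$. For this instance the Jacobian equation reads ψ(x,z) = (2z^3 - 2xz)(1 - z^2) = 0 (our own derivation from (2.7) with p = m_2 = 1), so for fixed x its real roots in z are 0, ±1 and, when x > 0, ±√x. The names `jacobian_roots`, `algorithm_3_1` and the grid size `n` are hypothetical.

```python
import math

def F(x, y):
    return x * y - y + 0.5 * y * y

def f(x, z):
    return -x * z * z + 0.5 * z ** 4

def H(x, y, z):
    return f(x, z) - f(x, y)

def jacobian_roots(x):
    """Real roots in [-1, 1] of psi(x, z) = (2z^3 - 2xz)(1 - z^2) for fixed x."""
    roots = [0.0, 1.0, -1.0]
    if x > 0.0:
        r = math.sqrt(x)
        roots += [r, -r]
    return roots

def algorithm_3_1(n=2001, eps=1e-8, k_max=20):
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]   # grid for x in [-1, 1]
    Z_k = []
    for k in range(k_max + 1):
        # Step 2: solve (P_k) over points satisfying the Jacobian equation.
        F_k, x_k, y_k = min(
            (F(x, y), x, y)
            for x in xs for y in jacobian_roots(x)
            if all(H(x, y, z) >= -1e-12 for z in Z_k)
        )
        # Step 3(a): solve (3.7), again restricted to roots of psi(x_k, .).
        T_k = jacobian_roots(x_k)
        v_k = min(H(x_k, y_k, z) for z in T_k)
        # Step 3(b) and Step 4: accept the point, or enlarge the grid set.
        if v_k >= -eps:
            return x_k, y_k, F_k, k + 1
        Z_k += [z for z in T_k if H(x_k, y_k, z) <= v_k + 1e-12]
    raise RuntimeError("k_max iterations reached without a feasible point")

x_star, y_star, F_star, iterations = algorithm_3_1()
```

Restricting both subproblems to roots of the Jacobian equation reduces each of them to a finite enumeration for this univariate lower level, which is why the loop stops after very few exchange steps. For lower level problems with more variables the zero set of ψ is not finite in general, and a global polynomial optimization solver is needed.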


For the exchange method to solve the SIPP (3.4) successfully, the two
subproblems (3.6) and (3.7) need to be solved globally in each iteration. This
can be done by Lasserre's hierarchy of semidefinite relaxations (cf. Section
2.1).

A) For solving (3.6) by Lasserre's hierarchy, we get a sequence of
monotonically increasing lower bounds for $F_k^*$, say,
$\{\rho_\ell\}_{\ell=1}^{\infty}$, that is,
\[
\rho_1 \le \cdots \le \rho_\ell \le \cdots \le F_k^*.
\]
Here, $\ell$ is a relaxation order. If for some value of $\ell$ we get a feasible
point $(x,y)$ for (3.6) such that $F(x,y) = \rho_\ell$, then we must have

\[
F(x,y) = F_k^* = \rho_\ell, \tag{3.9}
\]

and know $(x,y)$ is a global minimizer. This certifies that Lasserre's
relaxation of order $\ell$ is exact and (3.6) is solved globally, i.e.,
Lasserre's hierarchy has finite convergence. As recently shown in [30],
Lasserre's hierarchy has finite convergence when the archimedeanness and some
standard conditions well-known in optimization to be generic (i.e., linear
independence constraint qualification, strict complementarity and second order
sufficiency conditions) hold.

B) For a given polynomial optimization problem, there exists a sufficient (and
almost necessary) condition for detecting whether or not Lasserre's hierarchy
has finite convergence. The condition is flat truncation, proposed in [29]. It
was proved in [29] that Lasserre's hierarchy has finite convergence if the
flat truncation condition is satisfied. When the flat truncation condition
holds, we can also get the point $(x,y)$ in (3.9). In all of our numerical
examples, the flat truncation condition is satisfied, so we know that Lasserre
relaxations solved them exactly. There exist special optimization problems for
which Lasserre relaxations are not exact (see e.g. […, Chapter 5]). Even for
the worst case that Lasserre's hierarchy fails to have finite convergence,
flat truncation is still the right condition for checking asymptotic
convergence. This is proved in [29, §3].

C) In computational practice, semidefinite programs cannot be solved exactly,
because round-off errors always exist in computers. Therefore, if
$F(x,y) \approx \rho_\ell$, it is reasonable to claim that (3.6) is solved
globally. This numerical issue is a common feature of most computational
methods.

D) For the same reasons as above, the subproblem (3.7) can also be solved
globally by Lasserre's relaxations. Moreover, (3.7) uses the equation
$\psi(x_i^k, z) = 0$, obtained from Jacobian representation. As shown in [28],
Lasserre's hierarchy of relaxations, in combination with Jacobian
representations, always has finite convergence, under some nonsingularity
conditions. This result has been improved in […, Theorem 3.9] under weaker
conditions. Flat truncation can be used to detect the convergence (cf.
[29, §4.2]).

E) For all $\epsilon_1 > \epsilon_2 > 0$, it is easy to see that
$F_{\epsilon_1}^* \le F_{\epsilon_2}^* \le F^*$, and hence the feasible region and the
optimal value of the bilevel problems are monotone. Indeed, we can prove
$\lim_{\epsilon \to 0^+} F_\epsilon^* = F^*$ and the continuity of the optimal
solutions; see […, Theorem 4.1] for the result and a detailed proof. However,
we should point out that if $\epsilon > 0$ is not small enough, then the solution
of the approximate bilevel program may be very different from the one for the
original bilevel program. We refer to […, Example 4.1].

F) In Step 3 of Algorithm 3.1, the value of $v_i^k$ is a measure for the
feasibility of $(x_i^k, y_i^k)$ in (3.4). This is because $(x_i^k, y_i^k)$ is a
feasible point for (3.4) if and only if $v_i^k \ge 0$. By using the exchange
method, the subproblem (3.6) is only an approximation for (3.4), so typically
we have $v_i^k < 0$ if $(x_i^k, y_i^k)$ is infeasible for (3.4). The closer
$v_i^k$ is to zero, the better (3.6) approximates (3.4).


3.2. Two features of the algorithm. As in the introduction, we do not apply
the exchange method directly to (1.7), but instead to (3.4). Both (1.7) and
(3.4) are SIPPs that are equivalent to the SBPP (1.1). As the numerical
experiments will show, the SIPP (3.4) is much easier to solve by the exchange
method. This is because the Jacobian equation $\psi(x,y) = 0$ in (3.4) makes it
much easier for (3.6) to approximate (3.4) accurately. Typically, for a finite
grid set $Z_k$ of $Z$, the feasible sets of (3.4) and (3.6) have the same
dimension. However, the feasible set of (1.7) has smaller dimension than that
of (1.8). Thus, it is usually very difficult for (1.8) to approximate (1.7)
accurately, by choosing a finite set $Z_k$. In contrast, it is often much
easier for (3.6) to approximate (3.4) accurately. We illustrate this fact by
the following example.
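Example 3.2. ([…, Example 3.19]) Consider the SBPP:

\[
\left\{ \begin{array}{rl}
F^* := \min\limits_{x \in \mathbb{R},\, y \in \mathbb{R}} & F(x,y) := xy - y + \tfrac{1}{2} y^2 \\
\text{s.t.} & 1 - x^2 \ge 0, \ 1 - y^2 \ge 0, \\
& y \in S(x) := \operatorname*{argmin}\limits_{1 - z^2 \ge 0} \ f(x,z) := -x z^2 + \tfrac{1}{2} z^4.
\end{array} \right. \tag{3.10}
\]

Since $f(x,z) = \tfrac{1}{2}(z^2 - x)^2 - \tfrac{1}{2} x^2$, one can see that
\[
S(x) = \begin{cases} 0, & x \in [-1, 0), \\ \pm\sqrt{x}, & x \in [0, 1]. \end{cases}
\]
Therefore, the outer objective $F(x,y)$ can be expressed as
\[
F(x,y) = \begin{cases} 0, & x \in [-1, 0), \\ \tfrac{1}{2} x \pm (x - 1)\sqrt{x}, & x \in [0, 1]. \end{cases}
\]
So, the optimal solution and the optimal value of (3.10) are (with
$a = \frac{\sqrt{13} - 1}{6}$):
\[
(x^*, y^*) = (a^2, a) \approx (0.1886, 0.4343), \qquad
F^* = \tfrac{1}{2} a^2 + a^3 - a \approx -0.2581.
\]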
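
The closed-form solution can be double-checked numerically. The short script below is our own sanity check, not part of the paper's experiments. On the branch y = √x with a = √x, the outer objective becomes φ(a) = a^2/2 + a^3 - a, whose stationarity condition 3a^2 + a - 1 = 0 has the root a = (√13 - 1)/6 in [0, 1]; the script compares φ(a) with a fine grid search over the whole graph of S. The grid size `n` is an arbitrary choice.

```python
import math

def F(x, y):
    return x * y - y + 0.5 * y * y

# Stationary point of phi(a) = a^2/2 + a^3 - a on [0, 1]: 3a^2 + a - 1 = 0.
a = (math.sqrt(13.0) - 1.0) / 6.0
x_star, y_star = a * a, a
F_star = F(x_star, y_star)

# Grid search over the graph of S: y = 0 for x < 0, y = +/- sqrt(x) for x >= 0.
n = 20001
candidates = []
for i in range(n):
    x = -1.0 + 2.0 * i / (n - 1)
    ys = [0.0] if x < 0.0 else [math.sqrt(x), -math.sqrt(x)]
    candidates += [(F(x, y), x, y) for y in ys]
F_grid, x_grid, y_grid = min(candidates)

print(round(x_star, 4), round(y_star, 4), round(F_star, 4))
print(round(x_grid, 4), round(y_grid, 4), round(F_grid, 4))
```

Both printed lines should agree with the values reported above up to the grid resolution.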


If Algorithm 3.1 is applied without using the Jacobian equation
$\psi(x,y) = 0$, the computational results are shown in Table 1. The problem
(3.10) cannot be solved reasonably well. In contrast, if we apply Algorithm 3.1
with the Jacobian equation $\psi(x,y) = 0$, then (3.10) is solved very well. The
computational results are shown in Table 2. It takes only two iterations for
the algorithm to converge.


Table 1. Computational results without $\psi(x,y) = 0$

Iter k   (x_i^k, y_i^k)      z_{i,j}^k    F_k^*     v_i^k
0        (-1, 1)             4.098e-13    -1.5000   -1.5000
1        (0.1505, 0.5486)    ±0.3879      -0.3156   -0.0113
2        (0.0752, 0.3879)    ±0.2743      -0.2835   -0.0028
3        (0.2088, 0.5179)    ±0.4569      -0.2754   -0.0018
4        cannot be solved





Table 2. Computational results with $\psi(x,y) = 0$

Iter k   (x_i^k, y_i^k)      z_{i,j}^k    F_k^*     v_i^k
0        (-1, 1)             3.283e-21    -1.5000   -1.5000
1        (0.1886, 0.4342)    ±0.4342      -0.2581   -3.625e-12


For the lower level program (1.2), the KKT conditions may fail to hold. In such
a case, the classical methods, which replace (1.2) by the KKT conditions, do
not work at all. However, such problems can also be solved efficiently by
Algorithm 3.1. The following are two such examples.


Example 3.3. ([TD1 Example 2.4]) Consider the following SBPP: 

\[
F^* := \min_{x \in \mathbb{R},\, y \in \mathbb{R}} (x-1)^2 + y^2 \quad \text{s.t.} \quad y \in S(x) := \operatorname*{argmin}_{z \in Z := \{z \in \mathbb{R} \,\mid\, z^2 \le 0\}} x^2 z.
\tag{3.11}
\]
It is easy to see that the global minimizer of this problem is $(x^*, y^*) = (1, 0)$, with $F^* = 0$. The
set $Z = \{0\}$ is convex. By using the multiplier variable $\lambda$, we get a single level
optimization problem:
\[
r^* := \min_{x \in \mathbb{R},\, y \in \mathbb{R},\, \lambda \in \mathbb{R}} (x-1)^2 + y^2 \quad \text{s.t.} \quad x^2 + 2\lambda y = 0, \ \lambda \ge 0, \ y^2 \le 0, \ \lambda y^2 = 0.
\]
The feasible points of this problem are $(0, 0, \lambda)$ with $\lambda \ge 0$. We have $r^* = 1 > F^*$.
The KKT reformulation approach fails in this example, since $y^* \in S(x^*)$ is not a
KKT point. We solve the SBPP problem (3.11) by Algorithm 3.1. The Jacobian
equation is $\psi(x,y) = x^2 y^2 = 0$, and we reformulate the problem as:
\[
s^* := \min_{x \in \mathbb{R},\, y \in \mathbb{R}} (x-1)^2 + y^2 \quad \text{s.t.} \quad x^2 (z - y) \ge 0 \ \ \forall z \in Z, \quad \psi(x,y) = x^2 y^2 = 0.
\]
This problem is not actually an SIPP, since the set $Z$ only has one feasible point.
At the initial step, we find its optimal solution $(x^*, y^*) = (1, 0)$, and it is easy to
check that $\min_{z \in Z} H(x^*, y^*, z) = 0$, which certifies that it is the global minimizer of
the SBPP problem (3.11).


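Example 3.4. Consider the SBPP:
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R},\, y \in \mathbb{R}^2} & F(x,y) := x + y_1 + y_2 \\
\text{s.t.} & x - 2 \ge 0, \quad 3 - x \ge 0, \\
& y \in S(x) := \operatorname*{argmin}\limits_{z \in Z} f(x,z) := x(z_1 + z_2),
\end{array}
\tag{3.12}
\]
where the set $Z$ is defined by the inequalities:
\[
g_1(z) := z_1^2 - z_2^2 - (z_1^2 + z_2^2)^2 \ge 0, \qquad g_2(z) := z_1 \ge 0.
\]
For all $x \in [2, 3]$, one can check that $S(x) = \{(0,0)\}$. Clearly, the global minimizer
of (3.12) is $(x^*, y^*) = (2, 0, 0)$, and the optimal value is $F^* = 2$. At $z^* = (0,0)$,
\[
\nabla_z f(x, z^*) = \begin{bmatrix} x \\ x \end{bmatrix}, \qquad
\nabla_z g_1(z^*) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad
\nabla_z g_2(z^*) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]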


The KKT condition does not hold for the lower level program, since $\nabla_z f(x, z^*)$
is not a linear combination of $\nabla_z g_1(z^*)$ and $\nabla_z g_2(z^*)$. By m Proposition 3.4],
Lasserre relaxations in (2.2) do not have finite convergence for solving the lower
level program. One can check that
\[
K_{FJ}(x) = \{(0,0),\ (0.8990, 0.2409)\}\,^{1}
\]
for all feasible $x$. By the Jacobian representation of $K_{FJ}(x)$, we get
\[
\psi(x,z) = \Big( x\, g_1(z)\, g_2(z), \ -x z_1 \big( z_1 + z_2 + 2(z_2 - z_1)(z_1^2 + z_2^2) \big), \ -x\, g_1(z) \Big).
\]


Next, we apply Algorithm 3.1 to solve (3.12). Indeed, for $k = 0$, $Z_0 = \emptyset$, we get
$(x_1^0, y_1^0) \approx (2.0000, 0.0000, 0.0000)$,
which is the true global minimizer. We also get


2 ? w (4.6320,-4.6330) x 10“ 


-5.2510 x 10" 8 . 


¹ They are the solutions of the equations $g_1(z) = 0$, $z_1 + z_2 + 2(z_2 - z_1)(z_1^2 + z_2^2) = 0$.

For a small value of $\epsilon$ (e.g., $10^{-6}$), Algorithm 3.1 terminates successfully with the
global minimizer of (3.12).


3.3. Convergence analysis. We study the convergence properties of Algorithm
3.1. For theoretical analysis, one is mostly interested in its performance when the
tolerance parameter $\epsilon = 0$ or the maximum iteration number $k_{\max} = \infty$.

Theorem 3.5. For the simple bilevel polynomial program as in (1.1), assume the
lower level program is as in (3.2). Suppose the subproblems $(P_k)$ and each $(Q_i^k)$ are
solved globally by Lasserre relaxations.

(i) Assume $\epsilon = 0$. If Algorithm 3.1 stops for some $k < k_{\max}$, then each
$(x^*, y^*) \in X^*$ is a global minimizer of (1.1).

(ii) Assume $\epsilon = 0$, $k_{\max} = \infty$, and the union $\cup_{k \ge 0} Z_k$ is bounded. Suppose
Algorithm 3.1 does not stop and each $S_k \ne \emptyset$ is finite. Let $(x^*, y^*)$ be an
arbitrary accumulation point of the set $\cup_{k \ge 0} S_k$. If the value function $v(x)$,
as in (1.6), is continuous at $x^*$, then $(x^*, y^*)$ is a global minimizer of the
SBPP problem (1.1).

(iii) Assume $k_{\max} = \infty$, the union $\cup_{k \ge 0} Z_k$ is bounded, and the set $\Xi = \{(x,y) :
\psi(x,y) = 0, \ \xi(x,y) \ge 0\}$ is compact. Let $\Xi_1 = \{x : \exists y, (x,y) \in \Xi\}$, which
is the projection of $\Xi$ onto the $x$-space. Suppose $v(x)$ is continuous on $\Xi_1$.
Then, for all $\epsilon > 0$, Algorithm 3.1 must terminate within finitely many
steps, and each $(x, y) \in X^*$ is a global minimizer of the approximate SIPP
(3.5).


Proof. (i) The SBPP (1.1) is equivalent to (3.4). Note that each optimal value
$F_k^* \le F^*$ and the sequence $\{F_k^*\}$ is monotonically increasing. If Algorithm 3.1
stops at the $k$-th iteration, then each $(x^*, y^*) \in X^*$ is feasible for (3.4), and also
feasible for (1.1), so it holds that
\[
F^* \ge F_k^* = F(x^*, y^*) \ge F^*.
\]
This implies that $(x^*, y^*)$ is a global optimizer of problem (1.1).


(ii) Suppose Algorithm 3.1 does not stop and each $S_k \ne \emptyset$ is finite. For each
accumulation point $(x^*, y^*)$ of the union $\cup_{k \ge 0} S_k$, there exists a sequence $\{k_\ell\}$ of
integers such that $k_\ell \to \infty$ as $\ell \to \infty$ and
\[
(x^{k_\ell}, y^{k_\ell}) \to (x^*, y^*), \quad \text{where each } (x^{k_\ell}, y^{k_\ell}) \in S_{k_\ell}.
\]
Since the feasible set of problem $(P_{k_\ell})$ contains the one for problem (1.1), we have
$F_{k_\ell}^* = F(x^{k_\ell}, y^{k_\ell}) \le F^*$ and hence $F(x^*, y^*) \le F^*$ by the continuity of $F$. To
show the opposite inequality it suffices to show that $(x^*, y^*)$ is feasible for problem
(1.1). Recall that the function $\xi$ is defined as in (3.3). Since $\xi(x^{k_\ell}, y^{k_\ell}) \ge 0$ and
$\psi(x^{k_\ell}, y^{k_\ell}) = 0$, by the continuity of the mappings we have $\xi(x^*, y^*) \ge 0$ and
$\psi(x^*, y^*) = 0$. Define the function
\[
\phi(x,y) := \inf_{z \in Z} H(x,y,z).
\tag{3.13}
\]
Clearly, $\phi(x,y) = v(x) - f(x,y)$, and $\phi(x^*, y^*) = 0$ if and only if $(x^*, y^*)$ is
a feasible point for (1.1). By the definition of $v(x)$ as in (1.6) and that $v(x)$ is
continuous at $x^*$, we always have $\phi(x^*, y^*) \le 0$. To prove $\phi(x^*, y^*) = 0$, it remains
to show $\phi(x^*, y^*) \ge 0$. For all $k'$ and for all $k_\ell \ge k'$, the point $(x^{k_\ell}, y^{k_\ell})$ is feasible
for the subproblem $(P_{k'})$, so
\[
H(x^{k_\ell}, y^{k_\ell}, z) \ge 0 \quad \forall z \in Z_{k'}.
\]
Letting $\ell \to \infty$, we then get
\[
H(x^*, y^*, z) \ge 0 \quad \forall z \in Z_{k'}.
\tag{3.14}
\]
The above is true for all $k'$. In Algorithm 3.1, for each $k_\ell$, there exists $z^{k_\ell} \in T_i^{k_\ell}$,
for some $i$, such that
\[
\phi(x^{k_\ell}, y^{k_\ell}) = H(x^{k_\ell}, y^{k_\ell}, z^{k_\ell}).
\]
Since $z^{k_\ell} \in Z_{k_\ell + 1}$, by (3.14), we know
\[
H(x^*, y^*, z^{k_\ell}) \ge 0.
\]
Therefore, it holds that
\[
\begin{array}{rcl}
\phi(x^*, y^*) & = & \phi(x^{k_\ell}, y^{k_\ell}) + \phi(x^*, y^*) - \phi(x^{k_\ell}, y^{k_\ell}) \\
& \ge & \big[ H(x^{k_\ell}, y^{k_\ell}, z^{k_\ell}) - H(x^*, y^*, z^{k_\ell}) \big] + \big[ \phi(x^*, y^*) - \phi(x^{k_\ell}, y^{k_\ell}) \big].
\end{array}
\tag{3.15}
\]
Since $z^{k_\ell}$ belongs to the bounded set $\cup_{k \ge 0} Z_k$, there exists a subsequence $z^{k_{\ell,j}}$ such
that $z^{k_{\ell,j}} \to z^* \in Z$. The polynomial $H(x,y,z)$ is continuous at $(x^*, y^*, z^*)$. Since
$v(x)$ is continuous at $x^*$, $\phi(x,y) = v(x) - f(x,y)$ is also continuous at $(x^*, y^*)$.
Letting $\ell \to \infty$ along this subsequence in (3.15), we get $\phi(x^*, y^*) \ge 0$. Thus, $(x^*, y^*)$ is feasible for (3.4) and so
$F(x^*, y^*) \ge F^*$. Earlier, we already proved $F(x^*, y^*) \le F^*$, so $(x^*, y^*)$ is a
global optimizer of (3.4), i.e., $(x^*, y^*)$ is a global minimizer of the SBPP problem
(1.1).

(iii) Suppose otherwise that the algorithm does not stop within finitely many steps.
Then there exists a sequence $\{(x^k, y^k, z^k)\}$ such that $(x^k, y^k) \in S_k$, $z^k \in \cup_i T_i^k$, and
\[
H(x^k, y^k, z^k) < -\epsilon
\]
for all $k$. Note that $(x^k, y^k) \in \Xi$ and $z^k \in Z_{k+1}$, where $\Xi = \{(x,y) : \psi(x,y) = 0, \ \xi(x,y) \ge 0\}$
is the set in item (iii). By the assumption that $\Xi$
is compact and $\cup_{k \ge 0} Z_k$ is bounded, the sequence $\{(x^k, y^k, z^k)\}$ has a convergent
subsequence, say,
\[
(x^{k_\ell}, y^{k_\ell}, z^{k_\ell}) \to (x^*, y^*, z^*) \quad \text{as } \ell \to \infty.
\]
So, it holds that $(x^*, y^*) \in \Xi$, $z^* \in Z$ and $H(x^*, y^*, z^*) \le -\epsilon$. Since $\Xi$ is compact,
its projection $\Xi_1$ onto the $x$-space is also compact, hence $x^* \in \Xi_1$. By the assumption, we know
$v(x)$ is continuous at $x^*$. Similar to the proof in (ii), we have $\phi(x^*, y^*) = 0$, so
$(x^*, y^*)$ is a feasible point for (1.1), and we get
\[
H(x^*, y^*, z^*) = f(x^*, z^*) - f(x^*, y^*) \ge 0.
\]
However, this contradicts $H(x^*, y^*, z^*) \le -\epsilon$. Therefore, Algorithm 3.1 must
terminate within finitely many steps.

Now suppose Algorithm 3.1 terminates within finitely many steps at $(x, y) \in X^*$
with $\epsilon > 0$. Then $(x, y)$ must be a feasible solution to the approximate SIPP (3.5).
Hence it is obvious that $(x, y)$ is a global minimizer of (3.5).

□


In Theorem 3.5, we assumed that the subproblems $(P_k)$ and $(Q_i^k)$ can be solved
globally by Lasserre relaxations. This is a reasonable assumption; see
the remarks A)-D) after Algorithm 3.1. In items (ii)-(iii), the value function
$v(x)$ is assumed to be continuous at certain points. This can be satisfied under
some conditions. The restricted inf-compactness (RIC) is such a condition. The
value function $v(x)$ is said to have RIC at $x^*$ if $v(x^*)$ is finite and there exist a
compact set $\Omega$ and a positive number $\epsilon_0$, such that for all $\|x - x^*\| < \epsilon_0$ with
$v(x) < v(x^*) + \epsilon_0$, there exists $z \in S(x) \cap \Omega$. For instance, if the set $Z$ is compact,
or the lower level objective $f(x^*, z)$ is weakly coercive in $z$ with respect to the set $Z$,
i.e.,
\[
\lim_{z \in Z,\ \|z\| \to \infty} f(x^*, z) = \infty,
\]
then $v(x)$ has restricted inf-compactness at $x^*$; see, e.g., [B §6-5.1]. Note that the
union $\cup_{k \ge 0} Z_k$ is contained in $Z$. So, if $Z$ is compact then $\cup_{k \ge 0} Z_k$ is bounded.

Proposition 3.6. For the SBPP problem (1.1), assume the lower level program is
as in (3.2). If the value function $v(x)$ has restricted inf-compactness at $x^*$, then
$v(x)$ is continuous at $x^*$.

Proof. On one hand, since the lower level constraint is independent of $x$, the value
function $v(x)$ is always upper semicontinuous [21 Theorem 4.22 (1)]. On the other
hand, since the restricted inf-compactness holds, it follows from [SJ page 246] (or see
the proof of Theorem 3.9]) that $v(x)$ is lower semicontinuous. Therefore $v(x)$
is continuous at $x^*$. □


4. General Bilevel Polynomial Programs 


In this section, we study general bilevel polynomial programs as in (1.1). For
GBPPs, the feasible set $Z(x)$ of the lower level program (1.2) varies as $x$ changes,
i.e., the constraining polynomials $g_j(x,z)$ depend on $x$.

For each pair $(x, y)$ that is feasible for (1.1), $y$ is an optimizer for the lower level
program (1.2) parameterized by $x$, so $y$ must be a Fritz John point of (1.2), i.e.,
there exists $(\mu_0, \mu_1, \ldots, \mu_{m_2}) \ne 0$ satisfying
\[
\mu_0 \nabla_z f(x,y) - \sum_{j \in [m_2]} \mu_j \nabla_z g_j(x,y) = 0, \qquad
\mu_j g_j(x,y) = 0 \ \ (j \in [m_2]).
\]
For convenience, we still use $K_{FJ}(x)$ to denote the set of Fritz John points of (1.2)
at $x$. The set $K_{FJ}(x)$ consists of common zeros of some polynomials. As in (2.3),
choose the polynomials $(f(z), g_1(z), \ldots, g_m(z))$ to be $(f(x,z), g_1(x,z), \ldots, g_{m_2}(x,z))$,
whose coefficients depend on $x$. Then, construct $\psi_1, \ldots, \psi_L$ in the same way as in
(2.8). Each $\psi_j$ is also a polynomial in $(x,z)$. Thus, every $(x,y)$ feasible in (1.1)
satisfies $\psi_j(x,y) = 0$, for all $j$. For convenience of notation, we still denote the
polynomial tuples $\xi, \psi$ as in (3.3).
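As a small worked illustration (added here, not part of the original argument), the Fritz John condition can be checked by hand for the lower level program of Example 3.4 at $z^* = (0,0)$, where $\nabla_z f(x,z^*) = (x,x)^T$, $\nabla_z g_1(z^*) = (0,0)^T$ and $\nabla_z g_2(z^*) = (1,0)^T$. The computation below assumes only these gradients and $x \in [2,3]$; it shows that $z^*$ is a Fritz John point with $\mu_0 = 0$, which is why it is captured by $\psi = 0$ even though it is not a KKT point.

```latex
\[
\mu_0 \begin{bmatrix} x \\ x \end{bmatrix}
- \mu_1 \begin{bmatrix} 0 \\ 0 \end{bmatrix}
- \mu_2 \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\;\Longrightarrow\;
\mu_0 x = 0 \ \text{(second row)}, \quad \mu_0 x - \mu_2 = 0 \ \text{(first row)} .
\]
% Since x >= 2 > 0, the second row forces mu_0 = 0, and then the first row gives mu_2 = 0.
\[
(\mu_0, \mu_1, \mu_2) = (0, 1, 0) \ne 0, \qquad
\mu_1 g_1(z^*) = 0, \qquad \mu_2 g_2(z^*) = 0 .
\]
```

Every solution has $\mu_0 = 0$, so no KKT multiplier exists (KKT would require $\mu_0 = 1$), in agreement with the discussion in Example 3.4.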


We have seen that (1.1) is equivalent to the generalized semi-infinite polynomial
program ($H(x,y,z)$ is as in (1.5)):
\[
\begin{array}{rl}
F^* := \min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & \psi(x,y) = 0, \quad \xi(x,y) \ge 0, \\
& H(x,y,z) \ge 0, \quad \forall z \in Z(x).
\end{array}
\tag{4.1}
\]
Note that the constraint $H(x,y,z) \ge 0$ in (4.1) is required for $z \in Z(x)$, which
depends on $x$. Algorithm 3.1 can also be applied to solve (4.1). We first give an
example for showing how it works.

Example 4.1. ([24, Example 3.23]) Consider the GBPP:
\[
\begin{array}{rl}
\min\limits_{x,\, y \in [-1,1]} & x^2 \\
\text{s.t.} & 1 + x - 9x^2 - y \le 0, \\
& y \in \operatorname*{argmin}\limits_{z \in [-1,1]} \{ z \ \ \text{s.t.} \ \ z^2 (x - 0.5) \le 0 \}.
\end{array}
\tag{4.2}
\]
By simple calculations, one can show that
\[
Z(x) = \begin{cases} \{0\}, & x \in (0.5, 1], \\ [-1, 1], & x \in [-1, 0.5], \end{cases}
\qquad
S(x) = \begin{cases} \{0\}, & x \in (0.5, 1], \\ \{-1\}, & x \in [-1, 0.5]. \end{cases}
\]
The set $U = \{(x,y) \in [-1,1]^2 : 1 + x - 9x^2 - y \le 0, \ y^2 (x - 0.5) \le 0\}$. The feasible
set of (4.2) is:
\[
\mathcal{F} := \big( \{(x, 0) : x \in (0.5, 1]\} \cup \{(x, -1) : x \in [-1, 0.5]\} \big) \cap U.
\]
One can show that the global minimizer and the optimal value are
\[
(x^*, y^*) = \Big( \frac{1 - \sqrt{73}}{18}, -1 \Big) \approx (-0.4191, -1), \qquad
F^* = \Big( \frac{1 - \sqrt{73}}{18} \Big)^2 \approx 0.1757.
\]
By the Jacobian representation of Fritz John points, we get the polynomial
\[
\psi(x,y) = (x - 0.5)\, y^2 (y^2 - 1).
\]


We apply Algorithm 3.1 to solve (4.2). The computational results are reported in
Table 3. As one can see, Algorithm 3.1 takes two iterations to solve (4.2) successfully. □

Table 3. Results of Algorithm 3.1 for solving (4.2).

Iter k   (x_i^k, y_i^k)        z_{i,j}^k   F_k^*     v_i^k
0        (0.0000, 1.0000)      -1.0000     0.0000    -2.0000
1        (-0.4191, -1.0000)    -1.0000     0.1757    -2.4e-11

However, we would like to point out that Algorithm 3.1 might not solve GBPPs
globally. The following is such an example.


Example 4.2. ([23, Example 5.2]) Consider the GBPP:
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R},\, y \in \mathbb{R}} & (x - 3)^2 + (y - 2)^2 \\
\text{s.t.} & 0 \le x \le 8, \quad y \in S(x),
\end{array}
\tag{4.3}
\]
where $S(x)$ is the set of minimizers of the optimization problem
\[
\begin{array}{rl}
\min\limits_{z \in \mathbb{R}} & (z - 5)^2 \\
\text{s.t.} & 0 \le z \le 6, \quad -2x + z - 1 \le 0, \\
& x - 2z + 2 \le 0, \quad x + 2z - 14 \le 0.
\end{array}
\]
It can be shown that
\[
S(x) = \begin{cases}
\{1 + 2x\}, & x \in [0, 2], \\
\{5\}, & x \in (2, 4], \\
\{7 - \frac{x}{2}\}, & x \in (4, 6], \\
\emptyset, & x \in (6, 8].
\end{cases}
\]
The feasible set of (4.3) is thus the set
\[
\mathcal{F} := \{(x, y) \mid x \in [0, 6], \ y \in S(x)\}.
\]
It consists of three connected line segments. One can easily check that the global
optimizer and the optimal value are
\[
(x^*, y^*) = (1, 3), \qquad F^* = 5.
\]
The polynomial $\psi$ in the Jacobian representation is
\[
\psi(x,y) = (-2x + y - 1)(x - 2y + 2)(x + 2y - 14)\, y\, (y - 6)(y - 5).
\]
We apply Algorithm 3.1 to solve (4.3). The computational results are reported
in Table 4. For $\epsilon = 10^{-6}$, Algorithm 3.1 stops at $k = 1$, and returns the point
(2.9972, 5.0000), which is not a global minimizer. However, it is interesting to note
that the computed solution (2.9972, 5.0000) $\approx$ (3, 5) is a local optimizer of problem
(4.3). □

Table 4. Results of Algorithm 3.1 for solving (4.3).

Iter k   (x_i^k, y_i^k)      z_{i,j}^k   F_k^*     v_i^k
0        (2.7996, 2.3998)    5.0021      0.2000    -6.7611
1        (2.9972, 5.0000)    5.0021      9.0001    4.41e-6


Why does Algorithm 3.1 fail to find a global minimizer in Example 4.2? By
adding $z^0$ to the discrete subset $Z_1$, the feasible set of $(P_1)$ becomes
\[
\{x \in \mathbb{R}, \ y \in Z(x)\} \cap \{\psi(x,y) = 0\} \cap \{|y - 5| \le 0.0021\}.
\]
It does not include the unique global optimizer $(x^*, y^*) = (1, 3)$. In other words,
the reason is that $H(x^*, y^*, z^0) \ge 0$ fails to hold and hence by adding $z^0$, the true
optimal solution $(x^*, y^*)$ is not in the feasible region of problem $(P_1)$.
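The failure mechanism can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: it encodes the lower level constraints of (4.3) as an interval in $z$, recovers the global optimizer $(1, 3)$ by a grid scan, and then evaluates the cut $H(x, y, z^0) \ge 0$ with $z^0 = 5.0021$ taken from Table 4. The helper names `lower_interval`, `S` and `bilevel_grid` are hypothetical.

```python
def lower_interval(x):
    # Z(x) for (4.3): 0 <= z <= 6, z <= 1 + 2x, z >= (x + 2)/2, z <= 7 - x/2.
    lo = max(0.0, (x + 2) / 2)
    hi = min(6.0, 1 + 2 * x, 7 - x / 2)
    return (lo, hi) if lo <= hi else None

def S(x):
    # Minimizer of (z - 5)^2 over the interval Z(x): the projection of 5 onto it.
    iv = lower_interval(x)
    if iv is None:
        return None
    lo, hi = iv
    return min(max(5.0, lo), hi)

def F(x, y):
    # Outer objective of (4.3).
    return (x - 3) ** 2 + (y - 2) ** 2

def H(x, y, z):
    # H(x, y, z) = f(x, z) - f(x, y) with f = (z - 5)^2.
    return (z - 5) ** 2 - (y - 5) ** 2

def bilevel_grid(n=8001):
    # Scan x in [0, 8] and keep the best (F, x, y) with y = S(x).
    best = None
    for i in range(n):
        x = 8 * i / (n - 1)
        y = S(x)
        if y is None:
            continue
        cand = (F(x, y), x, y)
        if best is None or cand < best:
            best = cand
    return best

if __name__ == "__main__":
    z0 = 5.0021                      # point added to Z_1 (Table 4, k = 0)
    print(bilevel_grid())            # global optimizer found by the scan
    print(H(1.0, 3.0, z0))           # negative: the cut removes (1, 3)
    print(lower_interval(1.0))       # z0 lies outside Z(1)
```

The point $z^0$ is feasible for the lower level program at the iterate $x \approx 2.7996$ but not at $x^* = 1$, so the constraint it generates is not valid for the global optimizer.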

From the above example, we observe that the difficulty for solving GBPPs globally
comes from the dependence of the lower level feasible set on $x$. For a global
optimizer $(x^*, y^*)$, it is possible that $H(x^*, y^*, z_{i,j}^k) < 0$ for some $z_{i,j}^k$ at some step, i.e.,
$(x^*, y^*)$ may fail to satisfy the newly added constraint in $(P_{k+1})$: $H(x, y, z_{i,j}^k) \ge 0$.
In other words, $(x^*, y^*)$ may not be feasible for the subproblem $(P_{k+1})$. Let $\mathcal{X}_k$ be
the feasible set of problem $(P_k)$. Since $Z_k \subseteq Z_{k+1}$, we have $\mathcal{X}_{k+1} \subseteq \mathcal{X}_k$ and $(x^*, y^*)$
is not feasible for $(P_\ell)$, for all $\ell \ge k + 1$. In such a case, Algorithm 3.1 will fail to
find a global optimizer. However, this will not happen for SBPPs, since $Z(x) = Z$
for all $x$. For all $z \in Z$, we have $H(x^*, y^*, z) \ge 0$, i.e., $(x^*, y^*)$ is feasible for all
subproblems $(P_k)$. This is why Algorithm 3.1 has convergence to global optimal
solutions for solving SBPPs. However, under some further conditions, Algorithm
3.1 can solve GBPPs globally.

Theorem 4.3. For the general bilevel polynomial program as in (1.1), assume that
the lower level program is as in (1.2) and the minimum value $F^*$ is achievable at a
point $(x, y)$ such that $H(x, y, z) \ge 0$ for all $z \in Z_k$ and for all $k$. Suppose $(P_k)$ and
$(Q_i^k)$ are solved globally by Lasserre relaxations.

(i) Assume $\epsilon = 0$. If Algorithm 3.1 stops for some $k < k_{\max}$, then each
$(x^*, y^*) \in X^*$ is a global minimizer of the GBPP problem (1.1).

(ii) Assume $\epsilon = 0$, $k_{\max} = \infty$, and the union $\cup_{k \ge 0} Z_k$ is bounded. Suppose
Algorithm 3.1 does not stop and each $S_k \ne \emptyset$ is finite. Let $(x^*, y^*)$ be an
arbitrary accumulation point of the set $\cup_{k \ge 0} S_k$. If the value function $v(x)$,
defined as in (1.6), is continuous at $x^*$, then $(x^*, y^*)$ is a global minimizer
of the GBPP problem (1.1).

(iii) Assume $k_{\max} = \infty$, the union $\cup_{k \ge 0} Z_k$ is bounded, and the set $\Xi = \{(x,y) :
\psi(x,y) = 0, \ \xi(x,y) \ge 0\}$ is compact. Let $\Xi_1 = \{x : \exists y, (x,y) \in \Xi\}$, the
projection of $\Xi$ onto the $x$-space. Suppose $v(x)$ is continuous on $\Xi_1$. Then,
for all $\epsilon > 0$, Algorithm 3.1 must terminate within finitely many steps.

Proof. By the assumption, the point $(x, y)$ is feasible for the subproblem $(P_k)$, for
all $k$. Hence, we have $F_k^* \le F^*$. The rest of the proof is the same as the proof of
Theorem 3.5. □


In the above theorem, the existence of the point $(x, y)$ satisfying the requirement
may be hard to check. If $v(x)$ has restricted inf-compactness at $x^*$ and the
Mangasarian-Fromovitz constraint qualification (MFCQ) holds at all solutions of
the lower level problem (1.2), then the value function $v(x)$ is Lipschitz continuous
at $x^*$; see [JSJ Corollary 1]. Recently, it was shown in [12, Corollary 4.8] that the
MFCQ can be replaced by a weaker condition called quasinormality in the above
result.


5. Numerical experiments 

In this section, we present numerical experiments for solving BPPs. In Algorithm 3.1,
the polynomial optimization subproblems are solved by Lasserre semidefinite
relaxations, implemented in software Gloptipoly 3 m and the SDP solver
SeDuMi [35]. The computation is implemented with Matlab R2012a on a MacBook
Pro 64-bit OS X (10.9.5) system with 16GB memory and 2.3 GHz Intel Core i7
CPU. In the algorithms, we set the parameters $k_{\max} = 20$ and $\epsilon = 10^{-5}$. In reporting
computational results, we use $(x^*, y^*)$ to denote the computed global optimizers,
$F^*$ to denote the value of the outer objective function $F$ at $(x^*, y^*)$, $v^*$ to denote
$\inf_{z \in Z} H(x^*, y^*, z)$, Iter to denote the total number of iterations for convergence,
and Time to denote the CPU time taken to solve the problem (in seconds
unless stated otherwise). When $v^* \ge -\epsilon$, the computed point $(x^*, y^*)$ is considered
as a global minimizer of (P), up to the tolerance $\epsilon$. Mathematically, to solve BPPs
exactly, we need to set $\epsilon = 0$. However, in computational practice, round-off
errors always exist, so we choose $\epsilon > 0$ to be a small number.


5.1. Examples of SBPPs.

Example 5.1. ([24, Example 3.26]) Consider the SBPP:
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^2,\, y \in \mathbb{R}^3} & x_1 y_1 + x_2 y_2^2 + x_1 x_2 y_3^3 \\
\text{s.t.} & x \in [-1, 1]^2, \quad 0.1 - x_1^2 \le 0, \\
& 1.5 - y_1^2 - y_2^2 - y_3^2 \le 0, \\
& -2.5 + y_1^2 + y_2^2 + y_3^2 \le 0, \\
& y \in S(x),
\end{array}
\]

where S(x) is the set of minimizers of 

min ^ xizf + x 2 z% + (aq — x 2 )z 3 . 

It was shown in [24, Example 3.26] that the unique global optimal solution is
\[
x^* = (-1, -1), \qquad y^* = (1, \pm 1, -\sqrt{0.5}).
\]
Algorithm 3.1 terminates after one iteration. It takes about 14.83 seconds. We get
\[
x^* \approx (-1, -1), \quad y^* \approx (1, \pm 1, -0.7071), \quad
F^* \approx -2.3536, \quad v^* \approx -5.71 \times 10^{-9}.
\]

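Example 5.2. Consider the SBPP:
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^2,\, y \in \mathbb{R}^3} & x_1 y_1 + x_2 y_2 + x_1 x_2 y_1 y_2 y_3 \\
\text{s.t.} & x \in [-1, 1]^2, \quad y_1 y_2 - x_1^2 \le 0, \\
& y \in S(x),
\end{array}
\tag{5.1}
\]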


where S(x) is the set of minimizers of 

{ min Xizf + * 2^2 ^3 — z%z 3 
ze R 3 

s.t. 1 < z\ + z\ + z\ < 2. 

Algorithm 3.1 terminates after one iteration. It takes about 13.45 seconds. We get
\[
x^* \approx (-1, -1), \quad y^* \approx (1.1097, 0.3143, -0.8184), \quad
F^* \approx -1.7095, \quad v^* \approx -1.19 \times 10^{-9}.
\]
By Theorem 3.5, we know $(x^*, y^*)$ is a global optimizer, up to a tolerance around
$10^{-9}$.


Example 5.3. We consider some test problems from [23]. For convenience of
display, we choose the problems that have common constraints $x \in [-1, 1]$ for the
outer level program and $z \in [-1, 1]$ for the inner level program. When Algorithm 3.1
is applied, all these SBPPs are solved successfully. The outer objective $F(x,y)$, the
inner objective $f(x,z)$, the global optimizers $(x^*, y^*)$, the number of consumed
iterations Iter, the CPU time taken to solve the problem, the optimal value $F^*$,
and the value $v^*$ are reported in Table 5. In all problems, except Ex. 3.18 and Ex.
3.19, the optimal solutions we obtained coincide with those given in [24]. For Ex.
3.18, the global optimal solution for minimizing the upper level objective $-x^2 + y^2$
subject to constraints $x, y \in [-1, 1]$ is $x^* = 1$, $y^* = 0$. It is easy to check that $y^* = 0$
is the optimal solution for the lower level problem parameterized by $x^* = 1$ and
hence $x^* = 1$, $y^* = 0$ is also the unique global minimizer for the SBPP in Ex. 3.18.
For Ex. 3.19, as shown in [21], the optimal solution must have $x^* \in (0, 1)$. For such
$x^*$, $S(x^*) = \{\pm\sqrt{x^*}\}$. Plugging $y = \pm\sqrt{x}$ into the upper level objective we have
$F(x,y) = \pm x\sqrt{x} \mp \sqrt{x} + \frac{x}{2}$. It is obvious that the minimum over $0 < x < 1$ should
occur when $y = \sqrt{x}$. So minimizing $F(x,y) = x\sqrt{x} - \sqrt{x} + \frac{x}{2}$ over $0 < x < 1$ gives
$x^* \approx 0.1886$, $y^* = \frac{\sqrt{13} - 1}{6} \approx 0.4343$.

Table 5. Results for some SBPP problems in [24]. They have the
common constraints $x \in [-1, 1]$ and $z \in [-1, 1]$.

Problem    SBPP (outer F, inner f)                             (x*, y*)               Iter   Time   F*         v*
Ex. 3.14   F = (x - 1/4)^2 + y^2,   f = z^3/3 - xz             (0.2500, 0.5000)       2      0.49   0.2500     -5.7e-10
Ex. 3.15   F = x + y,               f = xz^2/2 - z^3/3         (-1.0000, 1.0000)      2      0.42   2.79e-8    -4.22e-8
Ex. 3.16   F = 2x + y,              f = -xz^2/2 - z^4/4        (-0.5, -1), (-1, 0)    2      0.47   -2.0000    -6.0e-10
Ex. 3.17   F = (x + 1/2)^2 + y^2/2, f = xz^2/2 + z^4/4         (-0.2500, ±0.5000)     4      1.12   0.1875     -8.3e-11
Ex. 3.18   F = -x^2 + y^2,          f = xz^2 - z^4/2           (1.0000, 0.0000)       2      0.44   -1.0000    -3.1e-13
Ex. 3.19   F = xy - y + y^2/2,      f = -xz^2 + z^4/2          (0.1886, 0.4343)       2      0.41   -0.2581    -3.6e-12
Ex. 3.20   F = (x - 1/4)^2 + y^2,   f = z^3/3 - x^2 z          (0.5000, 0.5000)       2      0.38   0.3125     -1.1e-10
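The entries of Table 5 can be sanity-checked independently of the SDP solver. The following sketch, added for illustration, re-evaluates the outer objective at each reported point and measures the lower level optimality gap $f(x^*, y^*) - \min_z f(x^*, z)$ on a grid of $z \in [-1, 1]$. The objectives are transcribed from the table as reconstructed here, and the names `PROBLEMS` and `check_table` are hypothetical.

```python
# Each entry: (outer F, inner f, reported points (x*, y*), reported F*).
PROBLEMS = {
    "Ex. 3.14": (lambda x, y: (x - 0.25) ** 2 + y ** 2,
                 lambda x, z: z ** 3 / 3 - x * z,
                 [(0.25, 0.5)], 0.25),
    "Ex. 3.15": (lambda x, y: x + y,
                 lambda x, z: x * z ** 2 / 2 - z ** 3 / 3,
                 [(-1.0, 1.0)], 2.79e-8),
    "Ex. 3.16": (lambda x, y: 2 * x + y,
                 lambda x, z: -x * z ** 2 / 2 - z ** 4 / 4,
                 [(-0.5, -1.0), (-1.0, 0.0)], -2.0),
    "Ex. 3.17": (lambda x, y: (x + 0.5) ** 2 + y ** 2 / 2,
                 lambda x, z: x * z ** 2 / 2 + z ** 4 / 4,
                 [(-0.25, 0.5), (-0.25, -0.5)], 0.1875),
    "Ex. 3.18": (lambda x, y: -x ** 2 + y ** 2,
                 lambda x, z: x * z ** 2 - z ** 4 / 2,
                 [(1.0, 0.0)], -1.0),
    "Ex. 3.19": (lambda x, y: x * y - y + y ** 2 / 2,
                 lambda x, z: -x * z ** 2 + z ** 4 / 2,
                 [(0.1886, 0.4343)], -0.2581),
    "Ex. 3.20": (lambda x, y: (x - 0.25) ** 2 + y ** 2,
                 lambda x, z: z ** 3 / 3 - x ** 2 * z,
                 [(0.5, 0.5)], 0.3125),
}

def check_table(nz=2001):
    # Returns the worst |F(x*, y*) - reported F*| and the worst lower level gap.
    zs = [-1 + 2 * i / (nz - 1) for i in range(nz)]
    worst_F_err, worst_gap = 0.0, 0.0
    for name, (F, f, points, F_reported) in PROBLEMS.items():
        for (x, y) in points:
            worst_F_err = max(worst_F_err, abs(F(x, y) - F_reported))
            gap = f(x, y) - min(f(x, z) for z in zs)
            worst_gap = max(worst_gap, gap)
    return worst_F_err, worst_gap

if __name__ == "__main__":
    print(check_table())
```

A small gap only confirms that each reported $y^*$ is (nearly) lower level optimal for its $x^*$ on the grid; it does not certify global optimality of the bilevel solution.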


Example 5.4. Consider the SBPP: 


mm 

zeKbyeR 4 

s.t. 


x\yi + x 2 y2 + aj 3 y| + Xiyl 


fII < i) yiy -2 - zi < o, 


(5.2) 

2/32/4 - x\ < 0, y G 5(x), 
where S'(ai) is the set of minimizers of 

z\ - z 2 (x x + x 2 ) - {z 3 + z 4 )(£3 + z 4 ) 


mm 

zeM 4 

s.t. 


< 1, z\ + z\ + Za - zi < 0. 


We apply Algorithm 3.1 to solve (5.2). The computational results are reported
in Table 6. As one can see, Algorithm 3.1 stops when $k = 4$ and solves (5.2)
successfully. It takes about 20 minutes to solve the problem. By Theorem 3.5, we
know the point $(x_i^k, y_i^k)$ obtained at $k = 4$ is a global optimizer for (5.2), up to a
tolerance around $10^{-8}$.


Table 6. Results of Algorithm 3.1 for solving (5.2).

Iter k   (x_i^k, y_i^k)                                                          F_k^*     v_i^k
0        (-0.0000, 1.0000, -0.0000, 0.0000, 0.6180, -0.7862, 0.0000, 0.0000)     -0.7862   -1.6406
1        (0.0000, -0.0000, 0.0000, -1.0000, 0.6180, -0.0000, 0.0000, -0.7862)    -0.6180   -0.3458
         (0.0003, -0.0002, -0.9999, 0.0000, 0.6180, 0.0001, -0.7861, -0.0000)    -0.6180   -0.3458
2        (0.0000, -0.0000, -0.8623, -0.5064, 0.6180, -0.0000, -0.6403, -0.4561)  -0.4589   -0.0211
3        (0.0000, -0.0000, -0.7098, -0.7042, 0.6180, -0.0000, -0.5570, -0.5548)  -0.4371   -6.37e-5
4        (0.0000, -0.0000, -0.7071, -0.7071, 0.6180, 0.0000, -0.5559, -0.5559)   -0.4370   -2.27e-8


An interesting special case of SBPPs is that the inner level program has no
constraints, i.e., $Z = \mathbb{R}^p$. In this case, the set $K_{FJ}(x)$ of Fritz John points is just
the set of critical points of the inner objective $f(x,z)$. It is easy to see that the
polynomial tuple $\psi$ is given as
\[
\psi(x,z) = \Big( \frac{\partial}{\partial z_1} f(x,z), \ \ldots, \ \frac{\partial}{\partial z_p} f(x,z) \Big).
\]

Example 5.5. (SBPPs with $Z = \mathbb{R}^p$) Consider random SBPPs with ball conditions
on $x$ and no constraints on $z$:
\[
\begin{array}{rl}
F^* := \min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & \|x\|^2 \le 1, \quad y \in \operatorname*{argmin}\limits_{z \in \mathbb{R}^p} f(x,z),
\end{array}
\tag{5.3}
\]


where F(x,y) and f(x,z) are generated randomly as 
F(x,y) := af [u] 2 d l _i + ||-Bi[u] dl 1| 2 , 

f(x,z ) := a 2 [x\ 2d2 -i +al[z\ 2 d 2 -1 + 

In the above, $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_p)$, $z = (z_1, \ldots, z_p)$, $u = (x, y)$ and
$d_1, d_2 \in \mathbb{N}$. The symbol $[x]_d$ denotes the vector of monomials in $x$ of degrees
$\le d$, while $[x]^d$ denotes the vector of monomials in $x$ of degrees equal to $d$. The
symbols $[y]_d$, $[y]^d$, $[u]^d$ are defined in the same way.



Table 7. Results for random SBPPs as in (5.3)

                      Iter               Time                      v*
n   p   d_1  d_2   Min  Avg  Max    Min    Avg    Max      Min       Avg       Max
2   3   3    2     1    1.9  5      00:01  00:02  00:06    -3.8e-6   -2.9e-7   -4.32e-8
3   3   2    2     1    1.6  2      00:04  00:07  00:09    -4.0e-6   -3.7e-7   -1.1e-10
3   3   3    2     1    1.7  2      00:04  00:07  00:10    -2.0e-6   -2.6e-7   -7.4e-11
4   2   2    2     1    1.4  3      00:04  00:06  00:09    -3.0e-6   -2.4e-7   -4.9e-12
4   3   2    2     1    2.3  5      00:15  00:41  01:36    -5.3e-6   -6.4e-7   -4.67e-9
5   2   2    2     1    1.9  4      00:14  00:33  01:13    -3.5e-6   -8.1e-7   -4.3e-11
5   3   2    2     1    1.8  3      06:30  10:04  11:56    -1.1e-6   -3.8e-7   -1.9e-10
6   2   2    2     1    2.0  4      04:02  09:56  17:39    -6.2e-6   -1.5e-6   -5.57e-7


We test the performance of Algorithm 3.1 for solving SBPPs in the form (5.3).
The computational results are reported in Table 7. In the table, we randomly
generated 20 instances for each case. AvgIter denotes the average number of
iterations taken by Algorithm 3.1, AvgTime denotes the average of consumed time,
and Avg($v^*$) denotes the average of the values $v^*$. The consumed computational
time is in the format mn:sc, with mn and sc standing for minutes and seconds
respectively. As we can see, these SBPPs were solved successfully. In Table 7,
the computational times in the last two rows are much bigger than those in the
previous rows. This is because the newly added Jacobian equation $\psi(x,y) = 0$ has
more polynomials and has higher degrees. Consequently, in order to solve $(P_k)$ and
$(Q_i^k)$ globally by Lasserre relaxations, the relaxation orders need to be higher. This
makes the semidefinite relaxations more difficult to solve.


Example 5.6. (Random SBPPs with ball conditions) Consider the SBPP:
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & \|x\|^2 \le 1, \quad y \in \operatorname*{argmin}\limits_{\|z\|^2 \le 1} f(x,z).
\end{array}
\tag{5.4}
\]
The outer and inner objectives $F(x,y)$, $f(x,z)$ are generated as
\[
F(x,y) = a^T [(x,y)]_{2d_1}, \qquad
f(x,z) = \begin{bmatrix} [x]_{d_2} \\ [z]_{d_2} \end{bmatrix}^T B \begin{bmatrix} [x]_{d_2} \\ [z]_{d_2} \end{bmatrix}.
\]
The entries of the vector $a$ and matrix $B$ are generated randomly obeying Gaussian
distributions. The symbols like $[(x,y)]_{2d_1}$ are defined similarly as in Example 5.5.
We apply Algorithm 3.1 to solve (5.4). The computational results are reported in
Table 8. The meanings of Inst, AvgIter, AvgTime, and Avg($v^*$) are the same as in
Example 5.5. As we can see, the SBPPs as in (5.4) can be solved successfully by
Algorithm 3.1.


Table 8. Results for random SBPPs in (5.4).

                      Iter               Time                      v*
n   p   d_1  d_2   Min  Avg  Max    Min    Avg    Max      Min       Avg       Max
3   2   2    2     1    2.6  6      00:01  00:03  00:06    -7.4e-7   -1.4e-7   2.0e-9
3   3   2    2     1    2.7  6      00:03  00:09  00:21    -2.6e-6   -6.5e-7   -1.5e-9
3   3   3    2     1    3.0  5      00:03  00:09  00:17    -2.9e-6   -3.6e-7   -1.1e-9
4   2   2    2     1    3.5  8      00:03  00:20  00:43    -1.8e-6   -5.0e-7   1.4e-9
4   3   2    2     1    2.6  5      00:12  00:31  01:01    -2.9e-6   -3.0e-7   1.8e-9
5   2   2    2     1    3.7  11     00:11  00:43  02:06    -3.9e-6   -1.7e-7   -3.4e-9
5   2   3    2     1    3.4  10     00:10  00:41  02:15    -3.6e-6   -5.4e-7   -1.5e-9
6   2   2    2     1    2.6  6      03:21  09:17  22:41    -4.3e-6   -5.7e-7   5.8e-10
6   2   3    2     1    2.4  5      03:15  08:23  17:42    -6.2e-7   -1.5e-7   2.7e-10


5.2. Examples of GBPPs. 

Example 5.7. Consider the GBPP: 

{ min \x\y\ + x 2 y\ - (xi + x%)y 3 
®eR 2 ,2/eR 3 

s.t. x e [- 1 , l] 2 , x\ + x 2 - x\ - y\ - y\ > 0 , 
y e S(x), 

where S(x) is the set of minimizers of 

min x 2 (ziz 2 z 3 + z\ — z 3 ) 


seR 3 

s.t. 


x\ — zf — z 2 — zf > 0 , 1 — 2 z 2 t 3 > 0 . 


We apply Algorithm 3.1 to solve (5.5). Algorithm 3.1 terminates at the iteration
$k = 0$. It takes about 10.18 seconds to solve the problem. We get
\[
x^* \approx (1, 1), \quad y^* \approx (0, 0, 1), \quad F_0^* \approx -2, \quad v^* \approx -2.95 \times 10^{-8}.
\]
Since $Z_0 = \emptyset$, we have $F_0^* \le F^*$ (the global minimum value). Moreover, $(x^*, y^*)$ is
feasible for (5.5), so $F(x^*, y^*) \ge F^*$. Therefore, $F(x^*, y^*) = F^*$ and $(x^*, y^*)$ is a
global optimizer, up to a tolerance around $10^{-8}$.

Example 5.8. Consider the GBPP: 

(xi +x 2 + x 3 + x 4 )(?/l +V 2 + V3 + Vi) 

s.t. ||x || 2 < 1 , yl - £4 < 0 , 

2 / 22/4 - X! < 0, y e S(x ), 

where S(x) is the set of minimizers of 

min XiZ\ + x 2 z 2 + O.I 23 + 0.52 4 — 2324 
zeR 4 

s.t. 2 2 + 2 z\ + 3z§ + 4 z\ <x\ + x 3 +x 2 + X 4 , 

Z 2 23 — 2 4 2 4 > 0. 


(5.6) 


mm 

a;eR 4 ,'yeM 4 


We apply Algorithm 3.1 to solve (5.6). The computational results are reported in
Table 9. Algorithm 3.1 stops with $k = 1$. It takes about 490.65 seconds to solve
the problem. We are not sure whether the point computed at $k = 1$ is a
global optimizer or not.

Table 9. Results of Algorithm 3.1 for solving (5.6).

Iter k   (x_i^k, y_i^k)                                                            F_k^*     v_i^k
0        (0.5442, 0.4682, 0.4904, 0.4942, -0.7792, -0.5034, -0.2871, -0.1855)      -3.5050   -0.0391
1        (0.5135, 0.5050, 0.4882, 0.4929, -0.8346, -0.4104, -0.2106, -0.2887)      -3.4880   3.29e-9


Example 5.9. In this example we consider some GBPP examples given in the 
literature. The problems and the computational results are displayed in Table [T0| 
Problem 1 is [2 Example 3.1] and the optimal solution (x*,y*) = (0, 0) is reported. 
Problem 2 is [HI Example 4.2] and the optimal solution (x*,y*) = (1,1) is reported. 
Problem 3 is [22] Example 3.22], As shown in [21], the optimal solution should 
attain at a point satisfying 0 < x < 1 and y = —0.5 + 0.1a:. For (x,y) satisfying 
these conditions, the lower level constraint 0 . 01(1 + x 2 ) — y 2 < 0 becomes inactive. 
Plugging y = —0.5+0.la; into the upper level objective, the bilevel program becomes 
finding the minimum of the convex function (x — 0.6 ) 2 + (—0.5 + 0.1a;) 2 . Hence the 
optimal solution is (x*,y*) = (y§|, Tfk)- Problem 4 can be found in [2H Example 
4.2] with the optimal solution (x*,y*) = (1,0,1) reported. Problem 5 can be 
found in [24] Example 5.1] where the optimal solution (x*,y*) = (5,4, 2) is derived. 
Problem 6 is BUI Example 3.1]. As shown in m, the optimal solution is (a;*, y*) = 
(a/0.5, x/OA). Problem 7 was originally given in [3. Example 3] and analyzed in [Tj. 
It was reported in [1] that the optimal solution is x* = (0,2 ),y* ss (1.875,0.9062). 
In fact we can show that the optimal solution is a:* = (0, 2), y* = ||) as follows. 

Since the upper objective is separable in $x$ and $y$, it is easy to show that the optimal
solution for the problem
\[
\min \ -x_1^2 - 3x_2 - 4y_1 + y_2^2 \quad \text{s.t.} \quad -x_1^2 - 2x_2 + 4 \ge 0
\]
with $y_1, y_2$ fixed is $x_1 = 0$, $x_2 = 2$. Since $y^* = (\frac{15}{8}, \frac{29}{32}) \approx (1.875, 0.9062)$ is the optimal solution to
the lower level problem parameterized by $x^* = (0, 2)$, we conclude that the optimal
solution is $x^* = (0, 2)$, $y^* = (\frac{15}{8}, \frac{29}{32})$. From Table 10 we can see that Algorithm 3.1
stops in very few steps with global optimal solutions for all problems.

Table 10. Results for some GBPPs

6. Conclusions and discussions

This paper studies how to solve both simple and general bilevel polynomial programs.
We reformulate them as equivalent semi-infinite polynomial programs, using
Fritz John conditions and Jacobian representations. Then we apply the exchange
technique and Lasserre type semidefinite relaxations to solve them. For solving
SBPPs, we proposed Algorithm 3.1 and proved its convergence to global optimal
solutions. For solving GBPPs, Algorithm 3.1 can also be applied, but its convergence
to global optimizers is not guaranteed. However, under some assumptions,
GBPPs can also be solved globally by Algorithm 3.1. Extensive numerical experiments
are provided to demonstrate the efficiency of the proposed method. To
see the advantages of our method, we would like to make some comparisons with
two existing methods for solving bilevel polynomial programs. The first one is the
value function approximation approach proposed by Jeyakumar, Lasserre, Li and
Pham [16]; the second one is the branch and bound approach proposed by Mitsos,
Lemonidis and Barton [26].

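The exchange technique used by Algorithm 3.1 can be illustrated on a toy problem. The following is a minimal sketch, not the paper's implementation: the objectives `F` and `f`, the grid, and the tolerance are hypothetical, the subproblems are solved by brute force on a grid instead of by Lasserre type relaxations, and the Fritz John / Jacobian conditions are omitted. It only shows the loop structure: solve a relaxation with finitely many cuts, solve the lower level problem at the candidate, and add the violated index if the candidate is not certified.

```python
"""Toy exchange method for a bilevel program with brute-force grid subproblems."""

GRID = [i / 10 for i in range(-10, 11)]  # grid on [-1, 1] with step 0.1 (assumption)
TOL = 1e-9


def F(x, y):
    # Hypothetical upper level objective.
    return (x - 1.0) ** 2 + (y + 1.0) ** 2


def f(x, z):
    # Hypothetical lower level objective; its minimizer over [-1, 1] is z = x.
    return (z - x) ** 2


def solve_relaxation(cuts):
    # Relaxed problem: minimize F subject to f(x, y) <= f(x, z) for each z in cuts.
    best = None
    for x in GRID:
        for y in GRID:
            if all(f(x, y) <= f(x, z) + TOL for z in cuts):
                if best is None or F(x, y) < best[0]:
                    best = (F(x, y), x, y)
    return best


def solve_lower(x):
    # Globally minimize the lower level objective for the fixed x.
    return min(GRID, key=lambda z: f(x, z))


def exchange(max_iter=50):
    cuts = []
    for k in range(max_iter):
        Fk, xk, yk = solve_relaxation(cuts)
        zk = solve_lower(xk)
        vk = f(xk, zk) - f(xk, yk)  # vk >= 0 (up to TOL) certifies that yk is in S(xk)
        if vk >= -TOL:
            return Fk, xk, yk, k
        cuts.append(zk)  # exchange step: add the violated index zk
    raise RuntimeError("no certificate within max_iter iterations")


if __name__ == "__main__":
    print(exchange())
```

For this toy problem the bilevel feasible set is the diagonal y = x, so the loop should stop at the point (0, 0) with upper level value 2.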


6.1. Comparison with the value function approximation approach. For
solving SBPPs with convex lower level programs, a semidefinite relaxation method
was proposed in [16, §3], under the assumption that the lower level programs satisfy
both the nondegeneracy condition and the Slater condition. It uses multipliers,
appearing in the Fritz John conditions, as new variables in sum-of-squares type
representations. For SBPPs with nonconvex lower level programs, it was proposed
in [16, §4] to solve the following $\epsilon$-approximation problem (for a tolerance parameter
$\epsilon > 0$):


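$$
(P_\epsilon^k): \quad
\left\{ \begin{array}{rl}
F_\epsilon^k := \min\limits_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^p} & F(x,y) \\
\text{s.t.} & G_i(x,y) \geq 0, \ i = 1, \ldots, m_1, \\
 & g_j(y) \geq 0, \ j = 1, \ldots, m_2, \\
 & f(x,y) - J_k(x) \leq \epsilon.
\end{array} \right.
\tag{6.1}
$$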


In the above, $J_k(x) \in \mathbb{R}_{2k}[x]$ is a $\frac{1}{k}$-solution for approximating the nonsmooth
value function $v(x)$ [16, Algorithm 4.5]. For a given parameter $\epsilon > 0$, the method
in [16, §4] finds the approximating polynomial $J_k(x)$ first, and then solves $(P_\epsilon^k)$ by
Lasserre type semidefinite relaxations. Theoretically, $\epsilon > 0$ can be chosen as small
as desired. However, in computational practice, when $\epsilon > 0$ is very small, the
degree $2k$ needs to be chosen very high, and then it is hard to compute $J_k(x)$. In
the following, we give an example to compare our Algorithm 3.1 and the method
in [16, §4].
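The effect of the tolerance $\epsilon$ can be seen on a small toy problem. The sketch below is an illustration under stated assumptions, not the method of [16]: the objectives `F` and `f` are hypothetical, `J` is taken to be the exact value function of the toy lower level problem (rather than a computed polynomial $J_k$), and the $\epsilon$-approximation problem is solved by brute force on a grid. For this toy problem the true bilevel optimal value is 2, and the relaxed value works out to $2(1 - \sqrt{\epsilon}/2)^2$, so it approaches 2 from below as $\epsilon \to 0$.

```python
"""Toy epsilon-approximation of a bilevel program, solved by grid search."""

GRID = [i / 100 for i in range(-100, 101)]  # grid on [-1, 1] with step 0.01 (assumption)


def F(x, y):
    # Hypothetical upper level objective.
    return (x - 1.0) ** 2 + (y + 1.0) ** 2


def f(x, y):
    # Hypothetical lower level objective; S(x) = {x} for x in [-1, 1].
    return (y - x) ** 2


def J(x):
    # Stand-in for J_k(x): the exact value function v(x) = min_z f(x, z) = 0.
    return 0.0


def F_eps(eps):
    # Grid value of: min F(x, y)  s.t.  f(x, y) - J(x) <= eps.
    return min(
        F(x, y)
        for x in GRID
        for y in GRID
        if f(x, y) - J(x) <= eps + 1e-9
    )


if __name__ == "__main__":
    for eps in (1.0, 0.25, 0.01, 0.0):
        print(eps, round(F_eps(eps), 4))
```

Even with an exact value function, the relaxed value stays strictly below the bilevel optimum for every $\epsilon > 0$, which is why the choice of $\epsilon$ matters in practice.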


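Example 6.1. Consider the following SBPP:

$$
\left\{ \begin{array}{rl}
F^* := \min\limits_{x \in \mathbb{R}^2,\, y \in \mathbb{R}^2} & y_1^3 (x_1^2 - 3 x_1 x_2) - y_1^2 y_2 + y_2 x_2^3 \\
\text{s.t.} & x \in [-1, 1]^2, \quad y_2 + y_1 (1 - x_1^2) \geq 0, \quad y \in S(x),
\end{array} \right.
\tag{6.2}
$$

where $S(x)$ is the solution set of the following optimization problem:

$$
\begin{array}{rl}
v(x) := \min\limits_{z \in \mathbb{R}^2} & z_1 z_2^2 - z_2^3 - z_1^2 (x_2 - x_1^2) \\
\text{s.t.} & z_1^2 + z_2^2 \leq 1.
\end{array}
$$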

The computational results of applying Algorithm 3.1 are shown in Table 11. It took
only two steps to solve the problem successfully. The set U is compact. For each $x$,
$S(x) \neq \emptyset$, since the lower level program minimizes a polynomial over a compact
set. The value function $v(x)$ of the lower level program is continuous. The feasible set
of problem (6.2) is nonempty and compact. At the iteration $k = 1$, the value $v_i^k$
is almost zero, so the point $(0.5708, -1.0000, -0.1639, 0.9865)$ is a global optimizer
of problem (6.2), up to a tolerance around $10^{-9}$.
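The reported optimizer can be sanity-checked numerically. The sketch below assumes the objective and constraint formulas of (6.2) as they are reconstructed from the extracted text; the grid resolution and the helper names are illustrative choices. It evaluates the upper level objective at the reported point and compares the lower level objective there with a grid estimate of $v(x)$ over the unit disk.

```python
import math


def F(x1, x2, y1, y2):
    # Upper level objective of (6.2), as reconstructed from the text (assumption).
    return y1**3 * (x1**2 - 3 * x1 * x2) - y1**2 * y2 + y2 * x2**3


def f(x1, x2, z1, z2):
    # Lower level objective of (6.2), as reconstructed from the text (assumption).
    return z1 * z2**2 - z2**3 - z1**2 * (x2 - x1**2)


def lower_value(x1, x2, n_angles=3600, n_radii=20):
    # Polar-grid estimate of v(x) = min f(x, z) over the disk z1^2 + z2^2 <= 1.
    best = f(x1, x2, 0.0, 0.0)
    for i in range(n_angles):
        t = 2 * math.pi * i / n_angles
        for j in range(1, n_radii + 1):
            r = j / n_radii
            best = min(best, f(x1, x2, r * math.cos(t), r * math.sin(t)))
    return best


x1, x2, y1, y2 = 0.5708, -1.0, -0.1639, 0.9865  # reported optimizer (4 decimals)
upper_value = F(x1, x2, y1, y2)
upper_constraint = y2 + y1 * (1 - x1**2)
gap = f(x1, x2, y1, y2) - lower_value(x1, x2)

if __name__ == "__main__":
    print(f"F = {upper_value:.4f}, constraint = {upper_constraint:.4f}, gap = {gap:.1e}")
```

Because the reported point is rounded to four decimals, the gap is only expected to be small, not of order $10^{-9}$.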


Table 11. Computational results of Algorithm 3.1 for solving (6.2).

Iter k   $(x_i^k, y_i^k)$                          $z_{i,j}^k$          $F_k^*$    $v_i^k$
------   ---------------------------------------   ------------------   --------   --------
0        ( 1.0000, -1.0000, -1.0000, 0.0000)       (-0.1355, 0.9908)    -4.0000    -3.0689
         (-1.0000,  1.0000, -1.0000, 0.0000)       (-0.2703, 0.9628)    -4.0000    -1.1430
1        ( 0.5708, -1.0000, -0.1639, 0.9865)       (-0.1638, 0.9865)    -1.0219    -4.76e-9
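The last column of Table 11 can be recomputed from the other columns. The sketch below assumes that the tabulated value is the difference between the lower level objective at the lower level minimizer $z$ and at the candidate $y$, and uses the lower level objective of Example 6.1 as reconstructed from the extracted text; both assumptions are consistent with the tabulated numbers up to rounding.

```python
def f(x, z):
    # Lower level objective of Example 6.1, as reconstructed (assumption).
    return z[0] * z[1] ** 2 - z[1] ** 3 - z[0] ** 2 * (x[1] - x[0] ** 2)


# Rows of Table 11: (iteration, x, y, z, reported value in the last column).
ROWS = [
    (0, (1.0, -1.0), (-1.0, 0.0), (-0.1355, 0.9908), -3.0689),
    (0, (-1.0, 1.0), (-1.0, 0.0), (-0.2703, 0.9628), -1.1430),
    (1, (0.5708, -1.0), (-0.1639, 0.9865), (-0.1638, 0.9865), -4.76e-9),
]


def recompute_v(x, y, z):
    # Assumed stopping quantity: f(x, z) - f(x, y), with z a minimizer of f(x, .).
    return f(x, z) - f(x, y)


if __name__ == "__main__":
    for k, x, y, z, reported in ROWS:
        print(k, round(recompute_v(x, y, z), 4), reported)
```

A value close to zero means that the candidate $y$ is (numerically) a global minimizer of the lower level program, which is the stopping test used at iteration $k = 1$.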


Next, we apply the method in [16, §4]. We use the software YALMIP [21] to
compute the approximating polynomial $J_k(x) \in \mathbb{R}_{2k}[x]$ as in [16, Algorithm 4.5].
After that, we solve the problem $(P_\epsilon^k)$ by Lasserre type semidefinite relaxations, for
a parameter $\epsilon > 0$. Let $F_\epsilon^k$ denote the optimal value of (6.1). The computational
results are shown in Table 12. As $\epsilon$ is close to 0, we can see that $F_\epsilon^k$ is close to the
true optimal value $F^* \approx -1.0219$. Since the method in [16] depends on the choice
of $\epsilon > 0$, we do not compare the computational time. In applications, the optimal
value $F^*$ is typically unknown. An interesting question for research is how to select
a value of $\epsilon > 0$ that guarantees $F_\epsilon^k$ is close enough to $F^*$.

Table 12. Computational results of the method in [16, §4].


$\epsilon$     $F_\epsilon^k$
--------   ----------------------------
1.0        -3.4372   -3.6423   -3.6439
0.5        -1.5506   -1.5909   -1.5912
0.25       -1.2718   -1.2746   -1.2750
0.125      -1.1746   -1.1775   -1.1779
0.05       -1.1193   -1.1224   -1.1228
0.01       -1.0897   -1.0930   -1.0934
0.005      -1.0858   -1.0892   -1.0897
0.001      -1.0827   -1.0862   -1.0867
0.0001     -1.0820   -1.0855   -1.0860


6.2. Comparison with the branch and bound approach. Mitsos, Lemonidis
and Barton [26] proposed a bounding algorithm for solving bilevel programs, in
combination with the exchange technique. It works on finding a point that satisfies
$\epsilon$-optimality in the inner and outer programs. For the lower bounding algorithm, a
relaxed program needs to be solved globally. The optional upper bounding problem
is based on probing the solution obtained by the lower bounding procedure. The
algorithm can be extended to use branching techniques. For brevity,
we do not repeat the details here. Interested readers are referred to [25]. We list
some major differences between the method in our paper and the one in [26].

• The method in [26] is based on building a tree of nodes of subproblems,
obtained by partitioning box constraints for the variables x, y. Our method
does not need to build such a tree of nodes and does not require box
constraints for partitioning.

• For each subproblem in the lower/upper bounding, a nonlinear nonconvex
optimization, or a mixed integer nonlinear nonconvex optimization, needs
to be solved globally or with $\epsilon$-optimality. The software GAMS [31] and
BARON [SB] are applied to solve them. In contrast, our method does not 
solve these nonlinear nonconvex subproblems by BARON and GAMS. Instead,
we solve them globally by Lasserre type semidefinite relaxations, which are
convex programs and can be solved efficiently by a standard SDP package
like SeDuMi. In our computational experiments, the subproblems are all
solved globally by GloptiPoly 3 [15] and SeDuMi [35].

In [25], the branch and bound method was implemented in C++, and the
subproblems were solved by BARON and GAMS. In our paper, the method is implemented
in MATLAB, and the subproblems are solved by GloptiPoly 3 and SeDuMi. The two
approaches and their implementations are very different, so it is hard to find a good way to
compare them directly. However, for BPPs, the subproblems in [26] and in our
paper are all polynomial optimization problems. To compare the two methods, it
is reasonable to compare the numbers of subproblems that need to be
solved, although this may not be the best way.

We choose the seven SBPPs in Example 5.3 which were also in [26]. The numbers
of subproblems are listed in Table 13. In the table, B & B (I) is the branch
and bound method in [26] without branching; B & B (II) is the branch and bound
method in [26] with branching; #LBD is the number of lower bounding subproblems;
#UBD is the number of upper bounding subproblems; #L-POP is the number
of subproblems $(P_k)$ that need to be solved in Algorithm 3.1; #U-POP is the number of
subproblems $(Q_i^k)$ that need to be solved in Algorithm 3.1. The number of variables in
lower bounding subproblems for branch and bound methods (I/II) and subproblem
$(P_k)$ for Algorithm 3.1 are the same, all equal to $n + p$; and the number of variables
in upper bounding subproblems for branch and bound methods (I/II) and subproblem
$(Q_i^k)$ for Algorithm 3.1 are the same, all equal to $p$. For problem Ex. 3.16, since
the subproblem $(P_k)$ has two optimal solutions, we need to solve two subproblems
$(Q_i^k)$ to check if they are both global optimal solutions. From Table 13, one
can see that Algorithm 3.1 needs to solve far fewer subproblems on the harder
instances (Ex. 3.17 and Ex. 3.19), and a comparable number on the others.
If all the subproblems are solved by the same method, Algorithm 3.1 is
expected to be more efficient.


Table 13. A comparison of the numbers of polynomial optimization
subproblems in [26] and in Algorithm 3.1.

            B & B (I)        B & B (II)       Alg. 3.1
Problem     #LBD   #UBD      #LBD   #UBD      #L-POP   #U-POP
---------   ----   ----      ----   ----      ------   ------
Ex. 3.14       4      3         7      3           2        2
Ex. 3.15       2      1         3      1           2        2
Ex. 3.16       2      1         3      1           2        3
Ex. 3.17      19     18        37     18           4        4
Ex. 3.18       2      2         3      2           2        2
Ex. 3.19      13     12        27     14           2        2
Ex. 3.20       4      3         5      3           2        2
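The comparison in Table 13 can be summarized by adding up, for each method, the lower and upper subproblem counts. The short sketch below only transcribes the table and sums its columns; the dictionary and function names are illustrative.

```python
# Counts transcribed from Table 13, per problem:
# (#LBD of B&B (I), #UBD of B&B (I), #LBD of B&B (II), #UBD of B&B (II), #L-POP, #U-POP).
TABLE_13 = {
    "Ex. 3.14": (4, 3, 7, 3, 2, 2),
    "Ex. 3.15": (2, 1, 3, 1, 2, 2),
    "Ex. 3.16": (2, 1, 3, 1, 2, 3),
    "Ex. 3.17": (19, 18, 37, 18, 4, 4),
    "Ex. 3.18": (2, 2, 3, 2, 2, 2),
    "Ex. 3.19": (13, 12, 27, 14, 2, 2),
    "Ex. 3.20": (4, 3, 5, 3, 2, 2),
}


def per_problem(table):
    # Total number of subproblems per problem, for each of the three methods.
    return {
        name: (row[0] + row[1], row[2] + row[3], row[4] + row[5])
        for name, row in table.items()
    }


def totals(table):
    # Total number of subproblems over all seven problems, for each method.
    rows = per_problem(table).values()
    return {
        "B&B (I)": sum(r[0] for r in rows),
        "B&B (II)": sum(r[1] for r in rows),
        "Alg. 3.1": sum(r[2] for r in rows),
    }


if __name__ == "__main__":
    for name, counts in per_problem(TABLE_13).items():
        print(name, counts)
    print(totals(TABLE_13))
```

The per-problem totals make it visible that the difference is driven mainly by Ex. 3.17 and Ex. 3.19, while on Ex. 3.15 and Ex. 3.16 the branch and bound method without branching solves slightly fewer subproblems.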